Future work #11

fjahr · 2023-12-17T20:19:32Z

The Kartograf project will be one year old in January. Lately, it feels like it is getting much closer to a stable state, thanks to the help of many testers who have found a lot of issues. I hope that next year the project can move beyond only finding and fixing issues and grow into a more stable development environment that is inviting to new contributors and pleasant to work with for the existing ones.

The following priorities are suggestions; feel free to brainstorm in this issue below.

Test framework and coverage: Currently this project has none. I have tried to structure the project in a modular fashion so that adding tests should be doable, but so far I haven't gotten to it. This is by far the biggest priority IMO because it will make future changes a lot easier and let us ensure that we handle the many edge cases properly.
Performance: The lowest-hanging fruit by far was the parallelization of the CPU-intensive jobs, which was done recently. I am sure there are still plenty of potential performance improvements. Since the work is now parallelized, the runtime varies a lot depending on the specs of the system, so it would be great if we could calculate a runtime estimate early and tell the user what to expect.
CI: We should have some :) Basics like linting, running the tests once we have them, and also testing the nix files.
Usability improvements for common use cases: While I would like the project to be universally useful, there are some interfaces that could probably be improved. This may include logging, for example, where we might want a -debug flag that shows more than we currently do and, if it is not set, show a lot less for non-dev users. This also potentially touches on compressing data (mentioned below) and the defaults of flags: currently we recommend building a map from RPKI, IRR and RV, but users need to set -irr and -rv explicitly. Maybe we should have these on by default and offer -no-irr and -no-rv instead. While I am sure there are many opportunities for improvement, I am not sure what all the answers are on this one.
Research on some processing decisions: We are currently making some small processing decisions where more research could be done to ensure we are really doing the best we can. I usually did some research, but not so extensively that I would be 100% sure all of these decisions are optimal. Examples: In RPKI, when we see duplicate entries for the same prefix but different ASNs, we decide by valid_until first, then by valid_since, and then we take the lower ASN as a deterministic tie-breaker. Also, CAIDA pfx2as files include multi-origin routes and AS-SETs. Expert interviews may be very helpful here.
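To make the RPKI tie-breaking rule above easier to discuss, here is a minimal sketch of how it could look. The `Roa` record and its field layout are hypothetical, not Kartograf's actual data structures, and the direction of the comparison (later valid_until wins, then later valid_since) is an assumption, since the description only gives the order of the criteria.

```python
from typing import NamedTuple


class Roa(NamedTuple):
    # Hypothetical record for illustration; not Kartograf's real type.
    prefix: str
    asn: int
    valid_since: int  # Unix timestamp
    valid_until: int  # Unix timestamp


def pick_roa(candidates):
    """Pick one entry among duplicates that share the same prefix."""
    # Assumption: a later valid_until is preferred, then a later valid_since.
    # The lower ASN is the deterministic tie-breaker.
    return min(candidates, key=lambda r: (-r.valid_until, -r.valid_since, r.asn))


def dedupe(roas):
    """Group entries by prefix and keep a single winner per prefix."""
    by_prefix = {}
    for roa in roas:
        by_prefix.setdefault(roa.prefix, []).append(roa)
    return {prefix: pick_roa(group) for prefix, group in by_prefix.items()}
```

Expressing the rule as a single sort key keeps the order of the criteria visible in one place, which might make it easier to test and to change if research suggests a different order.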

I am still undecided whether the following topics should be a priority:

More sources: Other projects that collected data for ASMap have downloaded raw files from multiple collectors and processed them, but ignored RPKI and IRR. It might still be interesting to get collector data from multiple sources (RIPE, Routeviews, others) and process it ourselves. So far, Kartograf only downloads the already processed pfx2as mappings from CAIDA/Routeviews. My rationale has been that we already trust the collector not to have tampered with the data, so we might as well let them do the work of processing it for us. They may even have better knowledge of their proprietary infrastructure that lets them process the data better than we could. But I am happy if someone wants to challenge that, and adding more collector data sources should just give more choice, which is generally a good thing.
Option to include your own sources: It sounds reasonable that some users may be able to get a BGP dump from a source they trust and build a map from that, potentially also mixing it with our data sources. But this would need extensive research into whether this is actually a use case worth maintaining and, if so, how people would imagine it should work.
Compression, intermediate data deletion: Currently the data directory keeps all the downloaded data of a mapping process. This is needed to share that data later so that others can reproduce the result. However, to save space on the user's system, we may want to compress that data at the end of the mapping process and also configure reproduction runs to accept a compressed file as the root data. In the out dir we save a lot of intermediate data and never delete it, primarily for debugging purposes. We may want to delete intermediate files by default and only keep them if, for example, a debug flag is set.
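As a rough illustration of the compression idea, the following sketch packs a data directory into a gzip-compressed tar archive and unpacks it again for a reproduction run. The function names and the archive format are assumptions made for this example; nothing here reflects how Kartograf currently lays out or reads its data.

```python
import tarfile
from pathlib import Path


def compress_data_dir(data_dir, archive_path):
    """Pack data_dir into a .tar.gz archive and return the archive path.

    archive_path should live outside data_dir so the archive does not
    end up containing itself.
    """
    data_dir = Path(data_dir)
    archive_path = Path(archive_path)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname stores paths relative to the directory name, not absolute.
        tar.add(data_dir, arcname=data_dir.name)
    return archive_path


def extract_data_archive(archive_path, target_dir):
    """Unpack an archive created by compress_data_dir into target_dir."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        # Only use this on archives from a source you trust.
        tar.extractall(target_dir)
    return target_dir
```

A single archive would also be easier to hash and share than a directory tree, which might fit the reproduction use case.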


fjahr · 2023-12-18T17:37:02Z

Two more topics:

We should clarify how to deal with the ARIN TAL RPA situation I described here: Demo using Collaborative Launch feature at 1702994400 (Kartograf 0.4.1) asmap-data#4 (comment)
Sjors mentioned here that interrupt behavior could be nicer: Interrupt could be nicer asmap-data#5 (comment). I would place this within the broad category of usability improvements, but it's good to note it as a specific to-do. I will add Sjors' original description below.

fjahr · 2023-12-18T17:38:30Z

Sjors noted in the asmap-data repository:

I did a ctrl+c during the "Validating RPKI" phase. The output looks a bit noisy and it also didn't stop, at least not before I gave up waiting after two minutes and ctrl + c again, which responded quickly.

$ ./run map -w 1702648800

--- Start Kartograf ---

Using rpki-client version 8.5.
Coordinated launch mode: Waiting until 1702648800 (2023-12-15 15:00:00 CET) to launch mapping process.
The epoch for this run is: 1702648800 (2023-12-15 14:00:00 UTC, local: 2023-12-15 15:00:00 CET)

--- Fetching RPKI ---

Downloading RPKI Data
...finished in 0:01:16.535906

--- Validating RPKI ---

Validating RPKI ROAs
^CTraceback (most recent call last):
  File "/home/sjors/dev/kartograf/./run", line 92, in <module>
    Kartograf.map(args)
  File "/home/sjors/dev/kartograf/kartograf/kartograf.py", line 68, in map
    validate_rpki_db(context)
  File "/home/sjors/dev/kartograf/kartograf/timed.py", line 10, in wrapper
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/sjors/dev/kartograf/kartograf/rpki/fetch.py", line 38, in validate_rpki_db
    with ThreadPoolExecutor() as executor:
  File "/home/sjors/.pyenv/versions/3.11.7/lib/python3.11/concurrent/futures/_base.py", line 647, in __exit__
    self.shutdown(wait=True)
  File "/home/sjors/.pyenv/versions/3.11.7/lib/python3.11/concurrent/futures/thread.py", line 235, in shutdown
    t.join()
  File "/home/sjors/.pyenv/versions/3.11.7/lib/python3.11/threading.py", line 1119, in join
    self._wait_for_tstate_lock()
  File "/home/sjors/.pyenv/versions/3.11.7/lib/python3.11/threading.py", line 1139, in _wait_for_tstate_lock
    if lock.acquire(block, timeout):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt

and

If the user does a ctrl + c, you could optionally ask if they want to delete the relevant out directory.
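The traceback shows the interrupt arriving while the `with ThreadPoolExecutor()` block is exiting, which calls `shutdown(wait=True)` and therefore waits for all remaining work. One possible way to react faster is to manage the executor manually and cancel queued work on Ctrl+C. The sketch below is only an illustration under that assumption: `run_interruptible` is a hypothetical helper, not existing Kartograf code, and `cancel_futures` requires Python 3.9 or newer.

```python
from concurrent.futures import ThreadPoolExecutor


def run_interruptible(func, items, max_workers=None):
    """Run func over items in threads; return results, or None if interrupted."""
    executor = ThreadPoolExecutor(max_workers=max_workers)
    try:
        futures = [executor.submit(func, item) for item in items]
        results = [future.result() for future in futures]
    except KeyboardInterrupt:
        # Drop queued work instead of waiting for it (Python 3.9+).
        # Tasks that are already running still finish in the background.
        executor.shutdown(wait=False, cancel_futures=True)
        print("Interrupted, cancelling pending work.")
        return None
    executor.shutdown(wait=True)
    return results
```

This only cancels tasks that have not started yet. Work that is already running in a thread cannot be stopped this way, so long-running tasks would need their own stop mechanism. The interrupt branch would also be a natural place to ask whether the relevant out directory should be deleted.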

brunoerg · 2023-12-18T22:17:16Z

Test framework and coverage: Currently this project has none. I have tried to structure the project in a modular fashion so that adding tests should be doable, but so far I haven't gotten to it. This is by far the biggest priority IMO because it will make future changes a lot easier and let us ensure that we handle the many edge cases properly.

This is obviously really important and should be a priority. Since we have to deal with a large and varied amount of data, fuzz testing would be useful to ensure we don't get any unexpected crashes. Also, related to topic 2 (performance), I think some benchmarking would be nice as well.

fjahr · 2023-12-19T12:48:43Z

This is obviously really important and should be a priority. Since we have to deal with a large and varied amount of data, fuzz testing would be useful to ensure we don't get any unexpected crashes. Also, related to topic 2 (performance), I think some benchmarking would be nice as well.

Good points, yeah, benchmarking would make further work on performance a lot more comfortable.

fjahr · 2023-12-28T20:57:21Z

This is a pretty big change but potentially helpful: We could track the data source for each entry during processing and then output a result file that contains all the sources as a comment at the end of each line. That might be very helpful for debugging diffs, if we would like to do that. However, we might also be able to get by with a semi-smart script that greps for the prefixes in the result files and gives a good educated guess about where they appeared first.
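The script-based alternative could be as small as the sketch below. It assumes each source has one result file with a `<prefix> <asn>` entry per line and that the files are passed in processing order; both the file format and the helper name `first_source_for_prefix` are assumptions for illustration, not the actual layout of the out directory.

```python
def first_source_for_prefix(prefix, source_files):
    """Return the name of the first source whose file lists the prefix.

    source_files is an iterable of (source_name, path) pairs, ordered the
    way the sources are processed. Returns None if no file has the prefix.
    """
    for name, path in source_files:
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                fields = line.split()
                # Compare the whole first field to avoid substring matches,
                # e.g. "10.0.0.0/8" inside "110.0.0.0/8".
                if fields and fields[0] == prefix:
                    return name
    return None
```

This only gives an educated guess, since it looks for an exact prefix match and ignores any merging or filtering that happens between the source files and the final result.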

fjahr mentioned this issue Dec 18, 2023

Interrupt could be nicer asmap/asmap-data#5

Closed

fjahr mentioned this issue Oct 20, 2024

proposal for additional data source -- full real-time BGP table asmap/asmap-data#17

Open
