Standalone python scripts for canned analysis #153

Open
wants to merge 45 commits into
base: develop
51afbb5
add examples directory and initial docs
slabasan Apr 23, 2024
1423748
Added stacked line graph example script
May 6, 2024
f70d3fa
reformatting to attempt to fix linter issue
May 6, 2024
5c96bf0
more linter
Thionazin May 6, 2024
b6c0d35
changed filter operation specification
Thionazin May 15, 2024
4b52f7e
fixed correct file
Thionazin May 15, 2024
e95e1fe
Added erro handling, help information, renamed variables and function…
Thionazin May 22, 2024
6a7b82c
optional graph customization options
Thionazin May 22, 2024
43ec667
error fixing
Thionazin May 22, 2024
2fe1c27
more error fixing
Thionazin May 22, 2024
0fe490d
added deep copy version as a seperate file
Thionazin May 22, 2024
0fcddee
changed metric of interest to be specified by user; fixed bug with pe…
Thionazin May 29, 2024
6533a74
Update examples/python_scripts/stacked_line_graphs.py
Thionazin May 23, 2024
d388626
Refactored names to better suit convention; Added output file name sp…
Thionazin May 31, 2024
cbdc405
updated chart output type selection
Thionazin May 31, 2024
26a329e
added interface for viewing valid metadata and metrics
Thionazin May 31, 2024
6500fb7
added x axis scaling option
Thionazin Jun 15, 2024
102e5ef
changed node name grouping to be optional
Thionazin Jun 15, 2024
f5f38a6
additional plotting options
Thionazin Jun 24, 2024
546d5f3
reformatted
Thionazin Jul 10, 2024
8666761
Update docs/examples.rst
Thionazin Jul 10, 2024
962c7c4
added kripke scaling studies charts
Thionazin Jul 29, 2024
c859d36
reformatted with black
Thionazin Jul 31, 2024
117f486
reformatted metadata script
Thionazin Jul 31, 2024
30f08fd
removed old file
Thionazin Jul 31, 2024
69f1c59
fixed imports
Thionazin Jul 31, 2024
e346242
fixed imports in metadata script
Thionazin Jul 31, 2024
3970a59
more reformatting
Thionazin Jul 31, 2024
1448e5e
updated docs
Thionazin Aug 2, 2024
75c6fa3
formatting fix
Thionazin Aug 2, 2024
50ae99b
updated charts on docs
Thionazin Aug 2, 2024
eb6f377
added docs argument table
Thionazin Aug 2, 2024
31a4608
fixed table formatting
Thionazin Aug 2, 2024
0b8fd26
Updated table of arguments
Thionazin Aug 2, 2024
8ee5fce
Add figsize fontsize arguements. set kwargs in function calls
michaelmckinsey1 Oct 10, 2024
1708422
Add 10th color
michaelmckinsey1 Oct 10, 2024
0736cec
Black
michaelmckinsey1 Oct 10, 2024
63a6623
Refactor
michaelmckinsey1 Oct 10, 2024
4f4b664
update examples.rst
michaelmckinsey1 Oct 10, 2024
f477426
Remove unused file
michaelmckinsey1 Oct 10, 2024
6a5a459
Update scripts
michaelmckinsey1 Oct 10, 2024
54b2513
Rename rst
michaelmckinsey1 Oct 10, 2024
d4a71c7
Adjust setting xticks
michaelmckinsey1 Oct 10, 2024
3128aa9
Update index
michaelmckinsey1 Oct 10, 2024
ad93af8
Add comments
michaelmckinsey1 Oct 11, 2024
97 changes: 97 additions & 0 deletions docs/analysis_examples.rst
@@ -0,0 +1,97 @@
..
   Copyright 2022 Lawrence Livermore National Security, LLC and other
   Thicket Project Developers. See the top-level LICENSE file for details.

   SPDX-License-Identifier: MIT

#####################################
1. Stacked Charts for Scaling Studies
#####################################

Thicket can help visualize the scaling behavior of an application.
The Python scripts in ``thicket/examples/python_scripts/`` show how to
generate stacked line charts from Caliper data.
The ``stacked_line_charts.py`` script visualizes scaling studies collected
with Caliper and loaded with Thicket.
It outputs a stacked line chart of Caliper node runtimes, either as a
percentage of the total or as total run time.

Running the Script:
*******************

.. code:: console

   $ python stacked_line_charts.py <arguments>

Script Arguments:
*****************
.. list-table:: Table of Arguments
   :widths: 50 50
   :header-rows: 1

   * - Argument
     - Description
   * - --input_files
     - Str: Required. Directory of Caliper input files; all subdirectories are searched.
   * - --x_axis_unique_metadata
     - Str: Required. Metadata parameter that is varied during the experiment.
   * - --chart_type
     - Str: Required. Type of output chart. Choices: "percentage_time" | "total_time".
   * - --x_axis_log_scaling_base
     - Int: Optional. Logarithmic scaling base for the x-axis of the chart. Default is -1 (linear scaling).
   * - --y_axis_metric
     - Str: Optional. Metric to be visualized. Default is "Avg time/rank (exc)".
   * - --filter_nodes_name_prefix
     - Str: Optional. Includes only entries with this prefix in the chart.
   * - --group_nodes_name
     - Bool: Optional. Whether nodes with the same name are combined. Default is True.
   * - --top_n_nodes
     - Int: Optional. Includes only the n entries with the longest time in the chart. Default is -1 (no filter).
   * - --chart_title
     - Str: Optional. Title of the output chart. Default is "Application Runtime Components".
   * - --chart_xlabel
     - Str: Optional. X-axis label of the chart.
   * - --chart_ylabel
     - Str: Optional. Y-axis label of the chart.
   * - --chart_file_name
     - Str: Optional. Output chart file name. Default is "stacked_line_chart".
   * - --chart_figsize
     - List of Ints: Optional. Size of the output chart (xdim, ydim). Example: ``--chart_figsize 10 5``.
   * - --chart_fontsize
     - Int: Optional. Font size of the output chart.


Kripke Example Output Charts:
*****************************

.. code:: console

   $ python stacked_line_charts.py --input_files "workspace/experiments/kripke/kripke/kripke_cuda_strong*" --x_axis_unique_metadata mpi.world.size --y_axis_metric "Avg time/rank (exc)" --chart_type percentage_time --chart_title "Kripke on Lassen (Strong Scaling)" --chart_file_name kripke_cuda_strong_perc --chart_ylabel "Percentage of Runtime for Average Time (exc)" --x_axis_log_scaling_base 2 --top_n_nodes 10

.. figure:: images/kripke_cuda_strong_perc.png
   :width: 800
   :align: center

.. code:: console

   $ python stacked_line_charts.py --input_files "workspace/experiments/kripke/kripke/kripke_cuda_strong*" --x_axis_unique_metadata mpi.world.size --y_axis_metric "Avg time/rank (exc)" --chart_type total_time --chart_title "Kripke on Lassen (Strong Scaling)" --chart_file_name kripke_cuda_strong_tot --chart_ylabel "Runtime for Average Time (exc)" --x_axis_log_scaling_base 2 --top_n_nodes 10

.. figure:: images/kripke_cuda_strong_tot.png
   :width: 800
   :align: center

.. code:: console

   $ python stacked_line_charts.py --input_files "workspace/experiments/kripke/kripke/kripke_cuda_weak*" --x_axis_unique_metadata zones --y_axis_metric "Avg time/rank (exc)" --chart_type percentage_time --chart_title "Kripke on Lassen (Weak Scaling)" --chart_file_name kripke_cuda_weak_perc --chart_ylabel "Percentage of Runtime for Average Time (exc)" --x_axis_log_scaling_base 2 --top_n_nodes 10

.. figure:: images/kripke_cuda_weak_perc.png
   :width: 800
   :align: center

.. code:: console

   $ python stacked_line_charts.py --input_files "workspace/experiments/kripke/kripke/kripke_cuda_weak*" --x_axis_unique_metadata zones --y_axis_metric "Avg time/rank (exc)" --chart_type total_time --chart_title "Kripke on Lassen (Weak Scaling)" --chart_file_name kripke_cuda_weak_total --chart_ylabel "Runtime for Average Time (exc)" --x_axis_log_scaling_base 2 --top_n_nodes 10

.. figure:: images/kripke_cuda_weak_total.png
   :width: 800
   :align: center
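A note on the ``percentage_time`` chart described above: as far as the script in this PR shows, each entry's share is its metric value divided by the column sum, times 100, and the shares are computed before the prefix and top-n filters are applied. The sketch below illustrates that arithmetic with plain Python dictionaries instead of a Thicket dataframe. The region names and timings are made up for illustration, and `percentage_shares` is a hypothetical helper, not part of the script.

```python
def percentage_shares(times):
    """Return each region's share of the summed time, in percent."""
    total = sum(times.values())
    if total == 0:
        return {name: 0.0 for name in times}
    return {name: 100.0 * t / total for name, t in times.items()}


# Hypothetical exclusive times (seconds) per region at two MPI sizes.
runs = {
    4: {"solve": 6.0, "sweep": 3.0, "comm": 1.0},
    8: {"solve": 3.0, "sweep": 1.5, "comm": 1.5},
}

# One set of shares per x-axis value, mirroring one chart column each.
shares = {size: percentage_shares(times) for size, times in runs.items()}

# Keeping only the two largest entries after the shares are computed means
# the remaining stack no longer has to add up to 100 percent.
top_two = dict(sorted(shares[4].items(), key=lambda kv: kv[1], reverse=True)[:2])

for size, share in shares.items():
    print(size, share)
print("top two at size 4:", top_two)
```

If the stacked areas in a filtered chart stop short of 100 percent, this ordering (shares first, filters second) is a likely explanation.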
Binary file added docs/images/kripke_cuda_strong_perc.png
Binary file added docs/images/kripke_cuda_strong_tot.png
Binary file added docs/images/kripke_cuda_weak_perc.png
Binary file added docs/images/kripke_cuda_weak_total.png
1 change: 1 addition & 0 deletions docs/index.rst
@@ -43,6 +43,7 @@ If you are new to thicket and want to start using it, see :doc:`Getting Started
getting_started
user_guide
generating_data
analysis_examples

If you encounter bugs while using thicket, you can report them by opening an issue on
`GitHub <http://github.com/llnl/thicket/issues>`_.
7 changes: 7 additions & 0 deletions examples/python_scripts/.gitignore
@@ -0,0 +1,7 @@
# Ignore everything in this directory
*
# Except this file
!.gitignore
!stacked_line_charts.py
!stacked_line_graphs_deepcopy.py
!display_valid_metadata_metric.py
31 changes: 31 additions & 0 deletions examples/python_scripts/display_valid_metadata_metric.py
@@ -0,0 +1,31 @@
from glob import glob
import sys
import thicket as th

sys.path.append("/usr/gapps/spot/dev/hatchet-venv/x86_64/lib/python3.9/site-packages/")
sys.path.append("/usr/gapps/spot/dev/hatchet/x86_64/")
sys.path.append("/usr/gapps/spot/dev/thicket-playground-dev/")


usage_str = "Please provide a directory of Caliper files\nUsage: python display_valid_metadata_metric.py <caliper_files_directory>"


def display_valid_metrics_metadata():
    tk = th.Thicket.from_caliperreader(glob(sys.argv[1] + "/**/*.cali", recursive=True))

    print("Valid metadata values:\n")
    for value in tk.metadata.columns:
        print("\t" + value)

    print("\n" + "-" * 30 + "\n")

    print("Valid metric values:\n")
    for value in tk.performance_cols:
        print("\t" + value)


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print(usage_str)
        sys.exit(1)
    display_valid_metrics_metadata()
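Both scripts build their file list with `glob(<input> + "/**/*.cali", recursive=True)`. The sketch below shows what that pattern picks up when the input itself contains a wildcard, as in the Kripke examples in the docs. It uses only the standard library and a temporary directory; the directory and file names are made up for illustration.

```python
import os
import tempfile
from glob import glob

with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: two "strong" run directories (one with a nested
    # subdirectory) and one "weak" run directory that should not match.
    paths = [
        os.path.join(root, "kripke_cuda_strong_1", "a.cali"),
        os.path.join(root, "kripke_cuda_strong_2", "nested", "b.cali"),
        os.path.join(root, "kripke_cuda_weak_1", "c.cali"),
    ]
    for path in paths:
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()

    # Same construction as in the scripts: input prefix + "/**/*.cali".
    pattern = os.path.join(root, "kripke_cuda_strong*") + "/**/*.cali"
    found = sorted(os.path.basename(p) for p in glob(pattern, recursive=True))
    print(found)
```

With `recursive=True`, `**` matches zero or more directory levels, so files directly inside a matching directory and files in nested subdirectories are both collected.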
233 changes: 233 additions & 0 deletions examples/python_scripts/stacked_line_charts.py
@@ -0,0 +1,233 @@
import argparse
from glob import glob
import re
import sys
import matplotlib.pyplot as plt
import matplotlib as mpl
import thicket as th

sys.path.append("/usr/gapps/spot/dev/hatchet-venv/x86_64/lib/python3.9/site-packages/")
sys.path.append("/usr/gapps/spot/dev/hatchet/x86_64/")
sys.path.append("/usr/gapps/spot/dev/thicket-playground-dev/")


def arg_parse():
    parser = argparse.ArgumentParser(
        prog="stacked_line_charts.py",
        description="Generate stacked line charts from Caliper files.",
        epilog="This script reads in Caliper files and generates stacked line charts based on the specified parameters.",
    )
    parser.add_argument(
        "--input_files",
        required=True,
        type=str,
        help="Directory of Caliper file input, including all subdirectories.",
    )
    parser.add_argument(
        "--x_axis_unique_metadata",
        required=True,
        type=str,
        help="Parameter that is varied during the experiment.",
    )
    parser.add_argument(
        "--chart_type",
        required=True,
        choices=["percentage_time", "total_time"],
        type=str,
        help="Specify type of output chart.",
    )
    parser.add_argument(
        "--x_axis_log_scaling_base",
        default=-1,
        type=int,
        help="Optional: Logarithmic scaling base for the x-axis. Default is -1 (linear scaling).",
    )
    parser.add_argument(
        "--y_axis_metric",
        default="Avg time/rank (exc)",
        type=str,
        help="Optional: Metric to be visualized.",
    )
    parser.add_argument(
        "--filter_nodes_name_prefix",
        default="",
        type=str,
        help="Optional: Include only entries with this prefix in the chart.",
    )
    parser.add_argument(
        "--group_nodes_name",
        default=True,
        type=lambda s: s.strip().lower() in ("true", "t", "yes", "y", "1"),
        help="Optional: Whether nodes with the same name are combined (true/false).",
    )
    parser.add_argument(
        "--top_n_nodes",
        default=-1,
        type=int,
        help="Optional: Include only the n entries with the longest time in the chart.",
    )
    parser.add_argument(
        "--chart_title",
        default="Application Runtime Components",
        type=str,
        help="Optional: Title of the output chart.",
    )
    parser.add_argument(
        "--chart_xlabel",
        type=str,
        help="Optional: X-axis label of the chart.",
    )
    parser.add_argument(
        "--chart_ylabel",
        type=str,
        help="Optional: Y-axis label of the chart.",
    )
    parser.add_argument(
        "--chart_file_name",
        default="stacked_line_chart",
        type=str,
        help="Optional: Output chart file name.",
    )
    parser.add_argument(
        "--chart_figsize",
        nargs="+",
        type=int,
        help="Optional: Size of the output chart (xdim, ydim). Ex: --chart_figsize 10 5",
    )
    parser.add_argument(
        "--chart_fontsize",
        type=int,
        help="Optional: Font size of the output chart.",
    )
    args = parser.parse_args()
    return args


def make_stacked_line_chart(df, chart_type, x_axis, y_axis_metric, **kwargs):
    if chart_type == "percentage_time":
        value = "perc"
        y_label = (
            kwargs["chart_ylabel"]
            if kwargs["chart_ylabel"]
            else "Percentage " + y_axis_metric
        )
    elif chart_type == "total_time":
        value = "Total time"
        y_label = (
            kwargs["chart_ylabel"]
            if kwargs["chart_ylabel"]
            else "Total " + y_axis_metric
        )
    else:
        raise ValueError(
            "Invalid chart_type value. Please choose from 'percentage_time' or 'total_time'."
        )

    df.to_csv(kwargs["chart_file_name"] + ".csv")

    tdf = df[[(i, value) for i in x_axis]].T
    tdf.index = [int(re.sub(r"\D", "", str(item))) for item in tdf.index]

    # Hard coded color map
    color = [
        "#00FFFF",
        "#ff7f00",
        "#4daf4a",
        "#f781bf",
        "#a65628",
        "#984ea3",
        "#999999",
        "#e41a1c",
        "#dede00",
        "#377eb8",
    ]
    mpl.rcParams["axes.prop_cycle"] = mpl.cycler(color=color)

    # Set font size of text
    if kwargs["chart_fontsize"]:
        mpl.rcParams.update({"font.size": kwargs["chart_fontsize"]})

    # Plotting
    fig, ax = plt.subplots()
    tdf.plot(
        kind="area",
        title=kwargs["chart_title"],
        xlabel=kwargs["chart_xlabel"],
        ylabel=y_label,
        figsize=tuple(kwargs["chart_figsize"]) if kwargs["chart_figsize"] else (10, 5),
        ax=ax,
    )

    # Set scaling of x-axis
    if kwargs["x_axis_log_scaling_base"] != -1:
        ax.set_xscale("log", base=kwargs["x_axis_log_scaling_base"])
    else:
        ax.set_xticks(tdf.index)

    # Reverse legend order
    handles, labels = ax.get_legend_handles_labels()
    ax.legend(reversed(handles), reversed(labels), bbox_to_anchor=(1.1, 1.05))

    # Try to fix xlabel spacing automatically
    fig.autofmt_xdate()

    plt.tight_layout()
    plt.savefig(kwargs["chart_file_name"] + ".png")


def process_thickets(
    input_files,
    x_axis_unique_metadata,
    y_axis_metric,
    filter_nodes_name_prefix,
    top_n_nodes,
    chart_type,
    **additional_args,
):

    tk = th.Thicket.from_caliperreader(glob(input_files + "/**/*.cali", recursive=True))

    tree_path = additional_args["chart_file_name"] + ".txt"
    with open(tree_path, "w") as f:
        f.write(tk.tree(metric_column=y_axis_metric))

    gb = tk.groupby(x_axis_unique_metadata)

    thickets = list(gb.values())
    x_axis = list(gb.keys())
    ctk = th.Thicket.concat_thickets(
        thickets=thickets,
        headers=x_axis,
        axis="columns",
    )

    if additional_args["group_nodes_name"]:
        ctk.dataframe = ctk.dataframe.groupby("name").sum()

    for i in x_axis:
        ctk.dataframe[i, "perc"] = (
            ctk.dataframe[i, y_axis_metric] / ctk.dataframe[i, y_axis_metric].sum()
        ) * 100

    if filter_nodes_name_prefix != "":
        ctk.dataframe = ctk.dataframe.filter(like=filter_nodes_name_prefix, axis=0)

    if top_n_nodes != -1:
        ctk.dataframe = ctk.dataframe.nlargest(top_n_nodes, [(x_axis[0], "Total time")])

    # Set default label to x_axis_unique_metadata if not provided
    if not additional_args["chart_xlabel"]:
        additional_args["chart_xlabel"] = x_axis_unique_metadata

    make_stacked_line_chart(
        df=ctk.dataframe,
        chart_type=chart_type,
        x_axis=x_axis,
        y_axis_metric=y_axis_metric,
        **additional_args,
    )


if __name__ == "__main__":
    args = arg_parse()
    process_thickets(**vars(args))
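A note on boolean options such as `--group_nodes_name`: passing `type=bool` to argparse is a common pitfall, because `bool()` is applied to the raw string and every non-empty string, including "False", is truthy. The sketch below contrasts that with an explicit converter. `str_to_bool` is a hypothetical helper written for this example, not something the PR or argparse provides.

```python
import argparse


def str_to_bool(text):
    """Convert common true/false spellings to a bool; reject anything else."""
    lowered = text.strip().lower()
    if lowered in ("true", "t", "yes", "y", "1"):
        return True
    if lowered in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {text!r}")


# Parser using type=bool: the string "False" is non-empty, so it becomes True.
naive = argparse.ArgumentParser()
naive.add_argument("--group_nodes_name", default=True, type=bool)

# Parser using the explicit converter: "False" is parsed as False.
fixed = argparse.ArgumentParser()
fixed.add_argument("--group_nodes_name", default=True, type=str_to_bool)

naive_value = naive.parse_args(["--group_nodes_name", "False"]).group_nodes_name
fixed_value = fixed.parse_args(["--group_nodes_name", "False"]).group_nodes_name
print(naive_value, fixed_value)
```

An explicit converter keeps the `--group_nodes_name <value>` form documented in the argument table. `argparse.BooleanOptionalAction` (Python 3.9+) is an alternative, but it changes the command-line syntax to `--group_nodes_name` / `--no-group_nodes_name`.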