
Bump pandas from 1.5.3 to the latest stable version #422

Merged: 102 commits (Jan 6, 2025)
fb4f0d6
Bump pandas from 1.5.3 to 2.0.3
Yerzhaisang Oct 29, 2024
06076d9
Updated CHANGELOG
Yerzhaisang Oct 29, 2024
3ef1e2c
Updated CHANGELOG
Yerzhaisang Nov 3, 2024
da14ee5
updated built-in method
Yerzhaisang Nov 4, 2024
c68b7f8
removed unused comment
Yerzhaisang Nov 4, 2024
0084cce
Implement to_list_if_needed method for list conversion
Yerzhaisang Nov 4, 2024
12bacb7
Refactor MAD calculation using CustomFunctionDispatcher for improved …
Yerzhaisang Nov 4, 2024
77ebf1c
Refactor MAD calculation using CustomFunctionDispatcher for improved …
Yerzhaisang Nov 4, 2024
45e793a
Refactor MAD calculation using CustomFunctionDispatcher for improved …
Yerzhaisang Nov 4, 2024
c4455e0
Refactor MAD calculation using CustomFunctionDispatcher for improved …
Yerzhaisang Nov 4, 2024
7135507
refactor: move metric identifiers to constants.py for readability
Yerzhaisang Nov 4, 2024
d2203be
refactor: move metric identifiers to constants.py for readability
Yerzhaisang Nov 4, 2024
c55639d
Removed unused line
Yerzhaisang Nov 5, 2024
b1f2319
clarify build_pd_series docstring
Yerzhaisang Nov 5, 2024
77c56a7
save constansts in utils.py
Yerzhaisang Nov 5, 2024
25905b6
fiexd CI
Yerzhaisang Nov 5, 2024
feeb0b5
added keyword argument to to_csv method
Yerzhaisang Nov 5, 2024
1162608
added comment to describe method
Yerzhaisang Nov 5, 2024
dce7c74
possible pandas versions
Yerzhaisang Dec 3, 2024
cd5ff41
upgrading pandas
Yerzhaisang Dec 9, 2024
87411e3
upgrading pandas
Yerzhaisang Dec 9, 2024
8665e59
wide range of pandas versions
Yerzhaisang Dec 9, 2024
a02af31
test
Yerzhaisang Dec 10, 2024
101ead9
test commit
Yerzhaisang Dec 15, 2024
7b1d24d
Fixed doctest dtypes
Yerzhaisang Dec 23, 2024
25cf1b5
Fixed doctest dtypes
Yerzhaisang Dec 23, 2024
8bfe735
Fixed datetime dtypes
Yerzhaisang Dec 27, 2024
f7746cb
Fixed datetime dtypes
Yerzhaisang Dec 27, 2024
cf9363a
testing another version
Yerzhaisang Dec 27, 2024
ce6554c
Fixed datetime dtypes
Yerzhaisang Dec 27, 2024
ff469e2
testing another version
Yerzhaisang Dec 27, 2024
c58cc67
Fixed datetime dtypes
Yerzhaisang Dec 27, 2024
8bf6835
Fixed datetime dtypes
Yerzhaisang Dec 27, 2024
6485228
Fixed datetime dtypes
Yerzhaisang Dec 27, 2024
dc4a5bf
testing another version
Yerzhaisang Dec 27, 2024
612f262
Fixed testing issues
Yerzhaisang Dec 28, 2024
83c0a24
Fixed testing issues
Yerzhaisang Dec 28, 2024
4f2d71a
testing another version
Yerzhaisang Dec 28, 2024
563488c
testing another version
Yerzhaisang Dec 28, 2024
1589de9
Fixed testing issues
Yerzhaisang Dec 28, 2024
36570c1
testing another version
Yerzhaisang Dec 28, 2024
2cc883d
testing another version
Yerzhaisang Dec 28, 2024
b6f3433
testing another version
Yerzhaisang Dec 28, 2024
5bfb86f
testing another version
Yerzhaisang Dec 28, 2024
f3337bf
testing another version
Yerzhaisang Dec 28, 2024
41f7b17
testing another version
Yerzhaisang Dec 28, 2024
77e4326
testing another version
Yerzhaisang Dec 28, 2024
78e530d
testing another version
Yerzhaisang Dec 28, 2024
ebea62e
testing another version
Yerzhaisang Dec 28, 2024
c073e96
testing another version
Yerzhaisang Dec 28, 2024
fdaf8df
testing another version
Yerzhaisang Dec 28, 2024
427a4aa
testing another version
Yerzhaisang Dec 28, 2024
20ec7e4
testing final pandas version range
Yerzhaisang Dec 28, 2024
f4aa0df
Merge branch 'dev' into dev_test
Yerzhaisang Dec 28, 2024
707eb1f
updating CHANGELOG
Yerzhaisang Dec 28, 2024
3bf0816
improving test coverage
Yerzhaisang Dec 28, 2024
3230f1a
Merge branch 'main' into dev
Yerzhaisang Jan 2, 2025
6b4f3a7
adapt tests for backward compatability
Yerzhaisang Jan 2, 2025
e53a0e3
adapt tests for backward compatability
Yerzhaisang Jan 2, 2025
deecdbc
adapt tests for backward compatability
Yerzhaisang Jan 2, 2025
a386bf4
adapt tests for backward compatability
Yerzhaisang Jan 2, 2025
8bf7df9
adapt tests for backward compatability
Yerzhaisang Jan 2, 2025
11aecc5
adapt tests for backward compatability
Yerzhaisang Jan 2, 2025
3f1834d
adapt tests for backward compatability
Yerzhaisang Jan 2, 2025
cc0d810
adapt tests for backward compatability
Yerzhaisang Jan 2, 2025
3bdfb6a
adapt tests for backward compatability
Yerzhaisang Jan 2, 2025
87ec2ff
adapt tests for backward compatability
Yerzhaisang Jan 2, 2025
d45da1a
adapt tests for backward compatability
Yerzhaisang Jan 2, 2025
49198f1
adapt tests for backward compatability
Yerzhaisang Jan 3, 2025
73bc70c
adapt tests for backward compatability
Yerzhaisang Jan 3, 2025
e9ec7fc
adapt tests for backward compatability
Yerzhaisang Jan 3, 2025
c9be924
adapt tests for backward compatability
Yerzhaisang Jan 3, 2025
b20c31c
adapt tests for backward compatability
Yerzhaisang Jan 3, 2025
c26e5e1
adapt tests for backward compatability
Yerzhaisang Jan 3, 2025
7522433
adapt tests for backward compatability
Yerzhaisang Jan 3, 2025
93f491a
adapt tests for backward compatability
Yerzhaisang Jan 3, 2025
c4ca7ee
rerun tests
Yerzhaisang Jan 3, 2025
be44c48
rerun tests
Yerzhaisang Jan 3, 2025
1c9159c
pandas 2.0.0
Yerzhaisang Jan 3, 2025
d5f365c
applied to_string and strip methods
Yerzhaisang Jan 3, 2025
6d1dd9e
applied to_string and strip methods
Yerzhaisang Jan 3, 2025
4fc4a88
applied to_string and strip methods
Yerzhaisang Jan 3, 2025
933498b
applied to_string and strip methods
Yerzhaisang Jan 3, 2025
cfa0e3d
applied to_string and strip methods
Yerzhaisang Jan 3, 2025
1d4691b
applied to_string and strip methods
Yerzhaisang Jan 4, 2025
240a902
applied to_string and strip methods
Yerzhaisang Jan 4, 2025
209a646
testing pandas 1.5.2
Yerzhaisang Jan 4, 2025
0efc20d
testing pandas 1.5.3
Yerzhaisang Jan 4, 2025
2e516f2
testing pandas 2.0.0
Yerzhaisang Jan 4, 2025
5129800
testing pandas 2.0.1
Yerzhaisang Jan 4, 2025
9f67ddb
testing pandas 2.0.2
Yerzhaisang Jan 4, 2025
e77e8c5
testing pandas 2.0.3
Yerzhaisang Jan 4, 2025
df05824
testing pandas 2.1.1
Yerzhaisang Jan 4, 2025
ad535f1
testing pandas 2.1.2
Yerzhaisang Jan 4, 2025
02b7fa2
testing pandas 2.1.3
Yerzhaisang Jan 4, 2025
c30064c
testing pandas 2.1.4
Yerzhaisang Jan 4, 2025
31b12f6
testing pandas 2.2.0
Yerzhaisang Jan 4, 2025
3bfa8b7
testing pandas 2.2.1
Yerzhaisang Jan 4, 2025
55faaf6
testing pandas 2.2.2
Yerzhaisang Jan 4, 2025
1ec27c2
testing pandas 2.2.3
Yerzhaisang Jan 4, 2025
88d38cd
fixing pandas version
Yerzhaisang Jan 4, 2025
d959396
fixing pandas version
Yerzhaisang Jan 4, 2025
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -48,6 +48,8 @@ Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
- Update model upload history - opensearch-project/opensearch-neural-sparse-encoding-doc-v2-mini (v.1.0.0)(TORCH_SCRIPT) by @dhrubo-os ([#417](https://github.com/opensearch-project/opensearch-py-ml/pull/417))
- Update model upload history - opensearch-project/opensearch-neural-sparse-encoding-v2-distill (v.1.0.0)(TORCH_SCRIPT) by @dhrubo-os ([#419](https://github.com/opensearch-project/opensearch-py-ml/pull/419))
- Upgrade GitHub Actions workflows to use `@v4` to prevent deprecation issues with `@v3` by @yerzhaisang ([#428](https://github.com/opensearch-project/opensearch-py-ml/pull/428))
- Bump pandas from 1.5.3 to the latest stable version by @yerzhaisang ([#422](https://github.com/opensearch-project/opensearch-py-ml/pull/422))


### Fixed
- Fix the wrong final zip file name in model_uploader workflow, now will name it by the upload_prefix alse.([#413](https://github.com/opensearch-project/opensearch-py-ml/pull/413/files))
2 changes: 1 addition & 1 deletion docs/requirements-docs.txt
@@ -1,5 +1,5 @@
opensearch-py>=2
pandas>=1.5,<3
pandas>=1.5.2,<2.3,!=2.1.0
matplotlib>=3.6.0,<4
nbval
sphinx
25 changes: 22 additions & 3 deletions opensearch_py_ml/common.py
@@ -55,14 +55,33 @@


def build_pd_series(
data: Dict[str, Any], dtype: Optional["DTypeLike"] = None, **kwargs: Any
data: Dict[str, Any],
dtype: Optional["DTypeLike"] = None,
index_name: Optional[str] = None,
**kwargs: Any,
) -> pd.Series:
"""Builds a pd.Series while squelching the warning
for unspecified dtype on empty series
"""
Builds a pandas Series from a dictionary, optionally setting an index name.

Parameters:
data : Dict[str, Any]
The data to build the Series from, with keys as the index.
dtype : Optional[DTypeLike]
The desired data type of the Series. If not specified, uses EMPTY_SERIES_DTYPE if data is empty.
index_name : Optional[str]
Name to assign to the Series index, similar to `index_name` in `value_counts`.

Returns:
pd.Series
A pandas Series constructed from the given data, with the specified dtype and index name.
"""

dtype = dtype or (EMPTY_SERIES_DTYPE if not data else dtype)
if dtype is not None:
kwargs["dtype"] = dtype
if index_name is not None:
Collaborator: Can we add a comment on why we need this?

Contributor Author: done

index = pd.Index(data.keys(), name=index_name)
kwargs["index"] = index
return pd.Series(data, **kwargs)


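The reworked helper above can be exercised in isolation. The following is a minimal self-contained sketch; the `EMPTY_SERIES_DTYPE` value here is a stand-in for the module's real constant:

```python
from typing import Any, Dict, Optional

import pandas as pd

EMPTY_SERIES_DTYPE = "object"  # stand-in; the real constant lives in opensearch_py_ml


def build_pd_series(
    data: Dict[str, Any],
    dtype: Optional[str] = None,
    index_name: Optional[str] = None,
    **kwargs: Any,
) -> pd.Series:
    # Fall back to EMPTY_SERIES_DTYPE only for an empty mapping, squelching
    # pandas' "unspecified dtype on empty series" warning.
    dtype = dtype or (EMPTY_SERIES_DTYPE if not data else dtype)
    if dtype is not None:
        kwargs["dtype"] = dtype
    if index_name is not None:
        # Naming the index mirrors what pandas >= 2 does in value_counts().
        kwargs["index"] = pd.Index(data.keys(), name=index_name)
    return pd.Series(data, **kwargs)


s = build_pd_series({"a": 3, "b": 1}, index_name="letters")
print(s.index.name)  # letters
```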
47 changes: 25 additions & 22 deletions opensearch_py_ml/dataframe.py
@@ -47,14 +47,17 @@
from opensearch_py_ml.groupby import DataFrameGroupBy
from opensearch_py_ml.ndframe import NDFrame
from opensearch_py_ml.series import Series
from opensearch_py_ml.utils import is_valid_attr_name
from opensearch_py_ml.utils import is_valid_attr_name, to_list_if_needed

if TYPE_CHECKING:
from opensearchpy import OpenSearch

from .query_compiler import QueryCompiler


PANDAS_MAJOR_VERSION = int(pd.__version__.split(".")[0])


class DataFrame(NDFrame):
"""
Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes
@@ -275,22 +278,13 @@ def tail(self, n: int = 5) -> "DataFrame":
>>> from tests import OPENSEARCH_TEST_CLIENT

>>> df = oml.DataFrame(OPENSEARCH_TEST_CLIENT, 'flights', columns=['Origin', 'Dest'])
>>> df.tail()
Origin \\
13054 Pisa International Airport...
13055 Winnipeg / James Armstrong Richardson International Airport...
13056 Licenciado Benito Juarez International Airport...
13057 Itami Airport...
13058 Adelaide International Airport...
<BLANKLINE>
Dest...
13054 Xi'an Xianyang International Airport...
13055 Zurich Airport...
13056 Ukrainka Air Base...
13057 Ministro Pistarini International Airport...
13058 Washington Dulles International Airport...
<BLANKLINE>
[5 rows x 2 columns]
>>> print(df.tail().to_string().strip())
Origin Dest
13054 Pisa International Airport Xi'an Xianyang International Airport
13055 Winnipeg / James Armstrong Richardson International Airport Zurich Airport
13056 Licenciado Benito Juarez International Airport Ukrainka Air Base
13057 Itami Airport Ministro Pistarini International Airport
13058 Adelaide International Airport Washington Dulles International Airport
"""
return DataFrame(_query_compiler=self._query_compiler.tail(n))

@@ -424,9 +418,14 @@ def drop(
axis = pd.DataFrame._get_axis_name(axis)
axes = {axis: labels}
elif index is not None or columns is not None:
Collaborator: Kind of confused here: the parent branch already checks that one of them is not None, but inside it checks again (lines 431 and 440). Maybe this could be simplified along the lines of what @pyek-bot suggested about creating a convert-to-list wrapper.

Contributor Author: Fixed
axes, _ = pd.DataFrame()._construct_axes_from_arguments(
(index, columns), {}
)
axes = {
"index": to_list_if_needed(index),
"columns": (
pd.Index(to_list_if_needed(columns))
if columns is not None
else None
Comment on lines +424 to +426
Reviewer: I noticed that in the implementation in opensearch_py_ml.utils, when the value is None it already returns None; maybe we don't need a ternary operation here since it's already doing that?

Contributor Author: Hey Brian, we can't remove the ternary operation at this point because calling pd.Index(None) results in a TypeError, so we need to keep this check.

),
}
else:
raise ValueError(
"Need to specify at least one of 'labels', 'index' or 'columns'"
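The `to_list_if_needed` helper imported above is not shown in this diff; a plausible sketch of its behavior, inferred from how `drop` uses it (the real implementation in `opensearch_py_ml.utils` may differ):

```python
import pandas as pd


def to_list_if_needed(value):
    """Normalize an index/columns argument to a plain list; None passes through.

    Sketch only: inferred from the call sites in DataFrame.drop above.
    """
    if value is None:
        return None
    if isinstance(value, pd.Index):
        return value.tolist()
    if isinstance(value, (list, tuple, set)):
        return list(value)
    return [value]  # single scalar label


print(to_list_if_needed("Carrier"))  # ['Carrier']
```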
@@ -440,7 +439,7 @@ def drop(
axes["index"] = [axes["index"]]
if errors == "raise":
# Check if axes['index'] values exists in index
count = self._query_compiler._index_matches_count(axes["index"])
count = self._query_compiler._index_matches_count(list(axes["index"]))
if count != len(axes["index"]):
raise ValueError(
f"number of labels {count}!={len(axes['index'])} not contained in axis"
@@ -1341,6 +1340,10 @@ def to_csv(
--------
:pandas_api_docs:`pandas.DataFrame.to_csv`
"""
if PANDAS_MAJOR_VERSION < 2:
line_terminator_keyword = "line_terminator"
else:
line_terminator_keyword = "lineterminator"
kwargs = {
"path_or_buf": path_or_buf,
"sep": sep,
@@ -1355,7 +1358,7 @@
"compression": compression,
"quoting": quoting,
"quotechar": quotechar,
"line_terminator": line_terminator,
line_terminator_keyword: line_terminator,
"chunksize": chunksize,
"date_format": date_format,
"doublequote": doublequote,
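The version gate above exists because pandas 2.0 renamed the `to_csv` keyword from `line_terminator` to `lineterminator`, so the keyword name itself must be chosen at runtime. A small sketch of the pattern:

```python
import io

import pandas as pd

PANDAS_MAJOR_VERSION = int(pd.__version__.split(".")[0])

# pandas < 2 spells the keyword `line_terminator`; pandas >= 2 only accepts
# `lineterminator`, so pick the name before building the kwargs dict.
terminator_kw = "line_terminator" if PANDAS_MAJOR_VERSION < 2 else "lineterminator"

buf = io.StringIO()
pd.DataFrame({"a": [1, 2]}).to_csv(buf, index=False, **{terminator_kw: "\n"})
print(buf.getvalue())  # a\n1\n2\n
```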
1 change: 1 addition & 0 deletions opensearch_py_ml/etl.py
@@ -108,6 +108,7 @@ def pandas_to_opensearch(
... 'G': [1, 2, 3],
... 'H': 'Long text - to be indexed as os type text'},
... index=['0', '1', '2'])
>>> pd_df['D'] = pd_df['D'].astype('datetime64[ns]')
>>> type(pd_df)
<class 'pandas.core.frame.DataFrame'>
>>> pd_df
7 changes: 4 additions & 3 deletions opensearch_py_ml/groupby.py
@@ -26,6 +26,7 @@
from typing import TYPE_CHECKING, List, Optional, Union

from opensearch_py_ml.query_compiler import QueryCompiler
from opensearch_py_ml.utils import MEAN_ABSOLUTE_DEVIATION, STANDARD_DEVIATION, VARIANCE

if TYPE_CHECKING:
import pandas as pd # type: ignore
@@ -153,7 +154,7 @@ def var(self, numeric_only: bool = True) -> "pd.DataFrame":
"""
return self._query_compiler.aggs_groupby(
by=self._by,
pd_aggs=["var"],
pd_aggs=[VARIANCE],
dropna=self._dropna,
numeric_only=numeric_only,
)
@@ -206,7 +207,7 @@ def std(self, numeric_only: bool = True) -> "pd.DataFrame":
"""
return self._query_compiler.aggs_groupby(
by=self._by,
pd_aggs=["std"],
pd_aggs=[STANDARD_DEVIATION],
dropna=self._dropna,
numeric_only=numeric_only,
)
@@ -259,7 +260,7 @@ def mad(self, numeric_only: bool = True) -> "pd.DataFrame":
"""
return self._query_compiler.aggs_groupby(
by=self._by,
pd_aggs=["mad"],
pd_aggs=[MEAN_ABSOLUTE_DEVIATION],
dropna=self._dropna,
numeric_only=numeric_only,
)
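The metric-name constants imported in the hunks above are, per the substitutions in the diff (`["mad"]` becomes `[MEAN_ABSOLUTE_DEVIATION]`, etc.), plain string identifiers centralized in `opensearch_py_ml/utils.py`, presumably along these lines:

```python
# Centralizing these identifiers avoids scattering typo-prone string literals
# such as "mad" across groupby.py, operations.py, and query_compiler.py.
# Values inferred from the one-to-one replacements visible in this diff.
MEAN_ABSOLUTE_DEVIATION = "mad"
STANDARD_DEVIATION = "std"
VARIANCE = "var"

print(MEAN_ABSOLUTE_DEVIATION, STANDARD_DEVIATION, VARIANCE)  # mad std var
```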
61 changes: 51 additions & 10 deletions opensearch_py_ml/operations.py
@@ -65,6 +65,7 @@
SizeTask,
TailTask,
)
from opensearch_py_ml.utils import MEAN_ABSOLUTE_DEVIATION, STANDARD_DEVIATION, VARIANCE

if TYPE_CHECKING:
from numpy.typing import DTypeLike
@@ -75,6 +76,8 @@
from opensearch_py_ml.query_compiler import QueryCompiler
from opensearch_py_ml.tasks import Task

PANDAS_MAJOR_VERSION = int(pd.__version__.split(".")[0])


class QueryParams:
def __init__(self) -> None:
@@ -475,7 +478,10 @@ def _terms_aggs(
except IndexError:
name = None

return build_pd_series(results, name=name)
if PANDAS_MAJOR_VERSION < 2:
return build_pd_series(results, name=name)
else:
return build_pd_series(results, index_name=name, name="count")
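The branch above exists because pandas 2 changed `value_counts()`: the result Series is now named `"count"` and the original Series name moves onto the index. A quick illustration (the printed name depends on the installed pandas version):

```python
import pandas as pd

s = pd.Series(["a", "a", "b"], name="Carrier")
vc = s.value_counts()
# pandas < 2:  vc.name == "Carrier", index unnamed
# pandas >= 2: vc.name == "count",   vc.index.name == "Carrier"
print(vc.name)
```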

def _hist_aggs(
self, query_compiler: "QueryCompiler", num_bins: int
@@ -620,7 +626,7 @@ def _unpack_metric_aggs(
values.append(field.nan_value)
# Explicit condition for mad to add NaN because it doesn't support bool
elif is_dataframe_agg and numeric_only:
if pd_agg == "mad":
if pd_agg == MEAN_ABSOLUTE_DEVIATION:
values.append(field.nan_value)
continue

@@ -1097,7 +1103,14 @@
"""
# pd aggs that will be mapped to os aggs
# that can use 'extended_stats'.
extended_stats_pd_aggs = {"mean", "min", "max", "sum", "var", "std"}
extended_stats_pd_aggs = {
"mean",
"min",
"max",
"sum",
VARIANCE,
STANDARD_DEVIATION,
}
extended_stats_os_aggs = {"avg", "min", "max", "sum"}
extended_stats_calls = 0

@@ -1117,15 +1130,15 @@
os_aggs.append("avg")
elif pd_agg == "sum":
os_aggs.append("sum")
elif pd_agg == "std":
elif pd_agg == STANDARD_DEVIATION:
os_aggs.append(("extended_stats", "std_deviation"))
elif pd_agg == "var":
elif pd_agg == VARIANCE:
os_aggs.append(("extended_stats", "variance"))

# Aggs that aren't 'extended_stats' compatible
elif pd_agg == "nunique":
os_aggs.append("cardinality")
elif pd_agg == "mad":
elif pd_agg == MEAN_ABSOLUTE_DEVIATION:
os_aggs.append("median_absolute_deviation")
elif pd_agg == "median":
os_aggs.append(("percentiles", (50.0,)))
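The renamed branches above map pandas aggregation names to OpenSearch aggregations; condensed into a dictionary, the mapping visible in this hunk looks like this (tuples mean "OpenSearch agg plus the field to read from its response"):

```python
# Condensed sketch of _map_pd_aggs_to_os_aggs, limited to the branches
# shown in this hunk; the real method also tracks extended_stats reuse.
PD_TO_OS_AGGS = {
    "mean": "avg",
    "sum": "sum",
    "std": ("extended_stats", "std_deviation"),
    "var": ("extended_stats", "variance"),
    "nunique": "cardinality",
    "mad": "median_absolute_deviation",
    "median": ("percentiles", (50.0,)),
}

print(PD_TO_OS_AGGS["mad"])  # median_absolute_deviation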
@@ -1205,7 +1218,7 @@ def describe(self, query_compiler: "QueryCompiler") -> pd.DataFrame:

df1 = self.aggs(
query_compiler=query_compiler,
pd_aggs=["count", "mean", "std", "min", "max"],
pd_aggs=["count", "mean", "min", "max", STANDARD_DEVIATION],
numeric_only=True,
)
df2 = self.quantile(
@@ -1219,9 +1232,37 @@
# Convert [.25,.5,.75] to ["25%", "50%", "75%"]
df2 = df2.set_index([["25%", "50%", "75%"]])

return pd.concat([df1, df2]).reindex(
["count", "mean", "std", "min", "25%", "50%", "75%", "max"]
)
df = pd.concat([df1, df2])

if PANDAS_MAJOR_VERSION < 2:
return pd.concat([df1, df2]).reindex(
["count", "mean", "std", "min", "25%", "50%", "75%", "max"]
)
else:
# Note: In recent pandas versions, `describe()` returns a different index order
# for one-column DataFrames compared to multi-column DataFrames.
# We adjust the order manually to ensure consistency.
if df.shape[1] == 1:
# For single-column DataFrames, `describe()` typically outputs:
# ["count", "mean", "std", "min", "25%", "50%", "75%", "max"]
return df.reindex(
[
"count",
"mean",
STANDARD_DEVIATION,
"min",
"25%",
"50%",
"75%",
"max",
]
)

# For multi-column DataFrames, `describe()` typically outputs:
# ["count", "mean", "min", "25%", "50%", "75%", "max", "std"]
return df.reindex(
["count", "mean", "min", "25%", "50%", "75%", "max", STANDARD_DEVIATION]
)
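Whatever row order the concatenation produces, the final `reindex` pins the canonical `describe()` ordering, which is the whole point of the branch above:

```python
import pandas as pd

df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0]})
stats = df.describe()

# Pin the canonical row order regardless of how the rows come back.
order = ["count", "mean", "std", "min", "25%", "50%", "75%", "max"]
stats = stats.reindex(order)
print(list(stats.index))
```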

def to_pandas(
self, query_compiler: "QueryCompiler", show_progress: bool = False
7 changes: 4 additions & 3 deletions opensearch_py_ml/query_compiler.py
@@ -45,6 +45,7 @@
from opensearch_py_ml.filter import BooleanFilter, QueryFilter
from opensearch_py_ml.index import Index
from opensearch_py_ml.operations import Operations
from opensearch_py_ml.utils import MEAN_ABSOLUTE_DEVIATION, STANDARD_DEVIATION, VARIANCE

if TYPE_CHECKING:
from opensearchpy import OpenSearch
@@ -587,17 +588,17 @@ def mean(self, numeric_only: Optional[bool] = None) -> pd.Series:

def var(self, numeric_only: Optional[bool] = None) -> pd.Series:
return self._operations._metric_agg_series(
self, ["var"], numeric_only=numeric_only
self, [VARIANCE], numeric_only=numeric_only
)

def std(self, numeric_only: Optional[bool] = None) -> pd.Series:
return self._operations._metric_agg_series(
self, ["std"], numeric_only=numeric_only
self, [STANDARD_DEVIATION], numeric_only=numeric_only
)

def mad(self, numeric_only: Optional[bool] = None) -> pd.Series:
return self._operations._metric_agg_series(
self, ["mad"], numeric_only=numeric_only
self, [MEAN_ABSOLUTE_DEVIATION], numeric_only=numeric_only
)

def median(self, numeric_only: Optional[bool] = None) -> pd.Series:
12 changes: 6 additions & 6 deletions opensearch_py_ml/series.py
@@ -311,12 +311,12 @@ def value_counts(self, os_size: int = 10) -> pd.Series:
>>> from tests import OPENSEARCH_TEST_CLIENT

>>> df = oml.DataFrame(OPENSEARCH_TEST_CLIENT, 'flights')
>>> df['Carrier'].value_counts()
Logstash Airways 3331
JetBeats 3274
Kibana Airlines 3234
ES-Air 3220
Name: Carrier, dtype: int64
>>> for key, value in df['Carrier'].value_counts().items():
... print(key, value)
Logstash Airways 3331
JetBeats 3274
Kibana Airlines 3234
ES-Air 3220
"""
if not isinstance(os_size, int):
raise TypeError("os_size must be a positive integer.")