[SPARK-46126][PYTHON][TESTS] Fix the doctest in pyspark.pandas.frame.DataFrame.to_dict (Python 3.12)

### What changes were proposed in this pull request?

This PR proposes to fix the doctest in `pyspark.pandas.frame.DataFrame.to_dict` so that it is compatible with Python 3.12, where the `repr` of `OrderedDict` changed.

```
File "/__w/spark/spark/python/pyspark/pandas/frame.py", line 2515, in pyspark.pandas.frame.DataFrame.to_dict
Failed example:
    df.to_dict(into=OrderedDict)
Expected:
    OrderedDict([('col1', OrderedDict([('row1', 1), ('row2', 2)])), ('col2', OrderedDict([('row1', 0.5), ('row2', 0.75)]))])
Got:
    OrderedDict({'col1': OrderedDict({'row1': 1, 'row2': 2}), 'col2': OrderedDict({'row1': 0.5, 'row2': 0.75})})
```
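The failure above comes down to how `OrderedDict` prints itself. As a minimal sketch (plain `collections.OrderedDict`, no Spark involved; the variable names are illustrative only), the same object renders differently depending on the interpreter version:

```python
import sys
from collections import OrderedDict

# Same data as the inner mapping in the failing doctest.
inner = OrderedDict([("row1", 1), ("row2", 2)])
text = repr(inner)

# Python 3.12 switched OrderedDict's repr from a list of pairs
# to a dict-style literal, which is what breaks the exact-match doctest.
if sys.version_info >= (3, 12):
    expected = "OrderedDict({'row1': 1, 'row2': 2})"
else:
    expected = "OrderedDict([('row1', 1), ('row2', 2)])"

print(text)
```

The contents and ordering are unchanged; only the printed form differs, so a doctest that compares output character by character fails on one side or the other.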

### Why are the changes needed?

So that the doctest passes on Python 3.12. It is currently failing; see https://github.com/apache/spark/actions/runs/7006848931/job/19059702970

### Does this PR introduce _any_ user-facing change?

No. There is a small user-facing documentation change, but it is trivial.

### How was this patch tested?

Fixed unittests. Manually tested via:

```bash
python/run-tests --python-executable=python3  --testnames 'pyspark.pandas.frame'
...
Tests passed in 721 seconds
```

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#44042 from HyukjinKwon/SPARK-46126.

Authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
HyukjinKwon committed Nov 28, 2023
1 parent ec7d07c commit 4f59e1b
Showing 1 changed file with 2 additions and 3 deletions.
5 changes: 2 additions & 3 deletions python/pyspark/pandas/frame.py
@@ -2512,9 +2512,8 @@ def to_dict(self, orient: str = "dict", into: Type = dict) -> Union[List, Mappin
 You can also specify the mapping type.
 >>> from collections import OrderedDict, defaultdict
->>> df.to_dict(into=OrderedDict)
-OrderedDict([('col1', OrderedDict([('row1', 1), ('row2', 2)])), \
-('col2', OrderedDict([('row1', 0.5), ('row2', 0.75)]))])
+>>> df.to_dict(into=OrderedDict) # doctest: +ELLIPSIS
+OrderedDict(...)
 If you want a `defaultdict`, you need to initialize it:
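To illustrate why the `+ELLIPSIS` directive makes the example version-independent, here is a small sketch that runs an equivalent doctest with the standard `doctest` module. It uses a plain `OrderedDict` rather than a pandas-on-Spark DataFrame, and the names `ELLIPSIS_DOC` and `sample` are hypothetical, chosen only for this example:

```python
import doctest

# A stand-in for the docstring example; it builds the OrderedDict directly
# instead of calling df.to_dict(into=OrderedDict).
ELLIPSIS_DOC = """
>>> from collections import OrderedDict
>>> OrderedDict([("row1", 1), ("row2", 2)])  # doctest: +ELLIPSIS
OrderedDict(...)
"""

# Parse the text into a DocTest and run it.
parser = doctest.DocTestParser()
test = parser.get_doctest(ELLIPSIS_DOC, globs={}, name="sample", filename=None, lineno=0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)

print(results)  # a TestResults tuple of (failed, attempted)
```

With `+ELLIPSIS`, the `...` in the expected output matches any substring, so both the list-of-pairs form and the dict-style form are accepted. The trade-off is that the doctest no longer checks the contents of the returned mapping, only that it prints as an `OrderedDict`.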
