Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add example for using multi-sheet Excel file #1760

Merged
merged 8 commits into from
Aug 18, 2022

Conversation

ankatiyar
Copy link
Contributor

@ankatiyar ankatiyar commented Aug 4, 2022

Description

Resolves #1700

Development notes

Checklist

  • Read the contributing guidelines
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes

@deepyaman deepyaman changed the title Add example for using multisheet excel file Add example for using multi-sheet Excel file Aug 5, 2022
@@ -164,7 +164,17 @@ rockets:
sheet_name: Sheet1
```

Example 7: Saves an image created with Matplotlib on Google Cloud Storage
Example 7: Loads a multisheet excel file from a local file system
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Example 7: Loads a multisheet excel file from a local file system
Example 7: Loads a multi-sheet Excel file from a local file system

:)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the change in 3f493a1, thanks!

@ankatiyar ankatiyar force-pushed the multisheet_excel_example branch from aaf3119 to 3f493a1 Compare August 5, 2022 08:18
@ankatiyar ankatiyar marked this pull request as ready for review August 5, 2022 08:28
@ankatiyar ankatiyar requested a review from yetudada as a code owner August 5, 2022 08:28
Copy link
Contributor

@noklam noklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Congrats on your first PR⭐️

Side notes:
Follow up on previous discussion about DataSet API Docs should be the 1st entry point for people looking up examples, I think we should do the followings:

  1. Remove these number, i.e. Example X, make them a sub-header like ## Save-an-image-xxxxx, so we can link these example directly in the doc, and also better SEO I think?
  2. Only keep some of the example in Data Catalog to showcase what Kedro can do, but leave the advance example in the Corresponding API Doc.
  • We have examples saving CSV on S3, matplotlib on GCP, excel on GCP, which I don't think add much value here, it's enough to show 1 example of GCP/S3.
  • Examples like saving HDF file doesn't worth the highlight in this page in my opinion, there could be more.

@datajoely
Copy link
Contributor

Can I ask that we include the example how to save a multi-sheet as well? The instructions to do so are buried in in the DataSet docstring:

you can include them under the "writer" key. Here you can

@merelcht
Copy link
Member

merelcht commented Aug 5, 2022

Side notes: Follow up on previous discussion about DataSet API Docs should be the 1st entry point for people looking up examples, I think we should do the followings:

  1. Remove these number, i.e. Example X, make them a sub-header like ## Save-an-image-xxxxx, so we can link these example directly in the doc, and also better SEO I think?
  2. Only keep some of the example in Data Catalog to showcase what Kedro can do, but leave the advance example in the Corresponding API Doc.

Yes we should, but let's do that as separate tasks to keep this PR focussed on one thing. We've got this issue for task 1: #1742. I'll update it so it covers point 2 as well.

@ankatiyar ankatiyar requested a review from idanov as a code owner August 9, 2022 16:17
Copy link
Contributor

@datajoely datajoely left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super close, slightly different wording suggested. Then I think we're good to go!

kedro/extras/datasets/pandas/excel_dataset.py Outdated Show resolved Hide resolved
kedro/extras/datasets/pandas/excel_dataset.py Outdated Show resolved Hide resolved
Copy link
Contributor

@datajoely datajoely left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 awesome work @ankatiyar

Copy link
Contributor

@yetudada yetudada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Congratulations on your first PR @ankatiyar! 🥳 And thank you for helping our users with this! We get this question a lot.

I'm approving this but I just note that we use excel (lowercase) when it should be Excel (uppercase). This actually correct based on what we have in our documentation at the moment.

Can you create a separate GitHub issue to address this as another quick-fix PR because I just checked the rest of the documentation and it looks like we've made that mistake in other places too?

@ankatiyar ankatiyar mentioned this pull request Aug 11, 2022
@merelcht merelcht merged commit 68d982e into kedro-org:main Aug 18, 2022
@ankatiyar ankatiyar deleted the multisheet_excel_example branch August 31, 2022 11:17
deepyaman pushed a commit to deepyaman/kedro that referenced this pull request Sep 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Document how to read and write a multi-sheet excel
6 participants