Discussion: One thing people really need to know that we haven't covered #207

Closed
lwjohnst86 opened this issue Sep 20, 2019 · 9 comments
Labels
discussion discussion before a proposal

Comments

@lwjohnst86
Member

As per our meeting discussion, @gvwilson asked: "what's the one thing that people really need to know that we haven't planned to cover?" (see minutes for more details).

@lwjohnst86 lwjohnst86 added the discussion discussion before a proposal label Sep 20, 2019
@gvwilson
Contributor

I think the sections on publishing and data inventory were two big gaps that are now being filled. I don't think we need anything on parallelism, job control, or performance tuning. I still think about regular expressions, but the one that sticks with me most is "how to use your editor efficiently". It seems like a small thing, but it's one of the differences I see between those who can program quickly and those who can't.

@ChristinaLK
Contributor

I agree re: the editor. I was watching someone who had learned command-line shortcuts beyond CTRL-A and CTRL-E and thought: this would help me. The same applies to the editor.

The only piece that I think is important for parallelism is designing your code in a modular way, so that it's easy to parallelize a chunk of it if needed.
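
A minimal sketch of what that modular design could look like in Python. The function name `summarize` and the sample data are hypothetical, not taken from the course material; the point is only that a function depending solely on its arguments can be handed to a parallel driver without being rewritten.

```python
from concurrent.futures import ThreadPoolExecutor


def summarize(values):
    """Return the mean of one chunk; it depends only on its argument."""
    return sum(values) / len(values)


# Hypothetical data: three independent chunks of measurements.
chunks = [[1, 2, 3], [10, 20, 30], [5, 5, 5]]

# Serial driver: call the function on each chunk in turn.
serial = [summarize(chunk) for chunk in chunks]

# Parallel driver: the same function, unchanged, mapped over a pool.
with ThreadPoolExecutor(max_workers=2) as pool:
    parallel = list(pool.map(summarize, chunks))
```

Threads are used here only to keep the sketch short; `ProcessPoolExecutor` from the same module offers the same `map` interface for CPU-bound work.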

@lwjohnst86
Member Author

I can't recall if this is in there, but: learning how to learn, i.e. how to use Stack Overflow or Google to solve the problem you have. I feel this is tricky to teach, though.

@ChristinaLK
Contributor

I put some of that into the "getting started" chapter for python: https://merely-useful.github.io/py/py-getting-started.html#py-getting-started-web-help

@k8hertweck
Contributor

I would strongly support more exercises interspersed throughout the material that reinforce how to properly phrase a question for Googling. I think this is a skill that develops along with someone's mental model of how terminology and practice come into shape, and extra examples are really useful, especially in the first 25% of the material.

@elliewix
Contributor

To @k8hertweck's comment: I agree, and I have a quick tutorial, based on a study of how novices google, that could be adapted for this.

@joelostblom
Contributor

joelostblom commented Oct 6, 2019

A few things on my mind; some of this may already be covered in sections under development.

  • Project structure - sound folder organization and principles, e.g. https://drivendata.github.io/cookiecutter-data-science/ (edit: just noticed that we have a section for project structure)
  • When to modularize code and how to effectively work with your own package in a data analysis pipeline
    • In addition to what we have on package and function development, I am thinking about things like when to break logical segments into separate functions, how to install editable packages, and how to automatically reload packages while working (e.g. the autoreload magic in Jupyter notebooks).
  • Data narrative - learning basics around how to tell a story with data.
  • Create a skeleton for analysis with what is important to always include versus what is data dependent (I haven't looked for a specific guide for this, but a general skeleton for a workflow that is more succinct than the lectures and references where to read more, e.g. 1. look for missing data, 2. check column data types, 3. make an overview plot, -1 include version numbers of packages so others can reproduce, etc)
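
A rough sketch of what such a skeleton might look like, using only the Python standard library so that it runs anywhere. The sample data, column names, and helper function names are all hypothetical, and a real analysis would more likely use a data frame library. The overview plot (step 3) is left out because it would require a plotting package.

```python
import csv
import io
import platform
from importlib import metadata

# Hypothetical sample data with two missing values.
SAMPLE = """id,height,species
1,1.5,oak
2,,pine
3,2.25,
"""


def load_rows(text):
    """Read CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def count_missing(rows):
    """Step 1: count empty cells in each column."""
    counts = {name: 0 for name in rows[0]}
    for row in rows:
        for name, value in row.items():
            if value == "":
                counts[name] += 1
    return counts


def guess_type(values):
    """Step 2: guess a column type from its non-empty values."""
    present = [v for v in values if v != ""]
    for kind in (int, float):
        try:
            for v in present:
                kind(v)
            return kind.__name__
        except ValueError:
            continue
    return "str"


def column_types(rows):
    """Apply guess_type to every column."""
    return {name: guess_type([row[name] for row in rows]) for name in rows[0]}


def package_versions(names):
    """Last step: record versions so others can reproduce the analysis."""
    versions = {"python": platform.python_version()}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


rows = load_rows(SAMPLE)
missing = count_missing(rows)
types = column_types(rows)
versions = package_versions(["pip"])
```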

@joelostblom
Contributor

@lwjohnst86 Regarding how to use SO, I have this section in py-dev.

@joelostblom
Contributor

joelostblom commented Oct 8, 2019

@gvwilson I added some editor tips in #225. Is this around the level you were thinking of, or should it be more advanced?

Also, I don't think we properly introduce lists, tuples, and dictionaries for Python anywhere?

6 participants