Make explicit in pandas docs the imports and the options #149

Open
datapythonista opened this issue Aug 21, 2019 · 5 comments
@datapythonista
Member

datapythonista commented Aug 21, 2019

See pandas-dev/pandas#28038

Until now, there has been a hidden code block at the beginning of every documentation page with imports, random seeds and options. There is agreement to make that code explicit, so that users can reproduce the code exactly and there is no "magic" going on.

Let's start by opening a PR to remove that header in this page: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html

My preferred option is to remove the {{ header }} variable and keep the currentmodule directive in its place, but not the code block. Then just add what is needed to the code blocks where things are first used. For example, add import pandas as pd at the beginning of the first code block. We probably only need the random seed code in the block where numpy.random is first used.
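As a rough sketch of what the body of such explicit code blocks could contain, assuming the imports and the seed move to the place where they are first used (the seed value and the small DataFrame are placeholders for illustration, not copied from the current header):

```python
import numpy as np
import pandas as pd

# Seed set in the block where numpy.random is first used, so that
# readers who rerun the example get the same numbers.
np.random.seed(123456)  # placeholder seed value

# Illustrative data only; the real page would use its own examples.
df = pd.DataFrame(np.random.randn(3, 2), columns=["a", "b"])
print(df)
```

With the seed visible in the page, anyone copying the block gets the same output as the rendered docs, which is the point of removing the hidden header.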

After the change, it would be good to diff the HTML output from before and after the change to see how much it changed, and to add that diff to the PR description.
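One possible way to produce that diff, as a minimal sketch using only the standard library. The helper name html_diff and the inline HTML snippets are made up for illustration; in practice the two strings would be read from the io.html files of the two builds.

```python
import difflib


def html_diff(before: str, after: str, name: str = "io.html") -> str:
    """Return a unified diff between two versions of a rendered page."""
    lines = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"before/{name}",
        tofile=f"after/{name}",
    )
    return "".join(lines)


# Tiny stand-ins for the HTML produced before and after the change.
before_html = "<pre>In [1]: df</pre>\n<p>unchanged</p>\n"
after_html = "<pre>In [1]: import pandas as pd</pre>\n<p>unchanged</p>\n"

diff_text = html_diff(before_html, after_html)
print(diff_text)
```

An empty result means the rendered page did not change at all; a plain command-line diff of the two build directories would serve the same purpose.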

Note that several people on the pandas team are likely to have opinions on how this change should be implemented, so be prepared for several reviews and several iterations of proposed changes and questions. :)

@TanyaaCJain
Contributor

I would like to work on this issue.

@TanyaaCJain
Contributor

TanyaaCJain commented Aug 21, 2019

Should we keep using aliases such as import pandas as pd? Also, where do the docs take the value of {{ header }} from?

@datapythonista
Member Author

header is defined in doc/source/conf.py, the configuration file of Sphinx. For now we need to keep using aliases in pandas. Hopefully that will change soon, but this is the policy at the moment.

@TanyaaCJain
Contributor

On the first occurrence of os, I imported the library and inserted the code os.chdir(r'{}') from the {{header}}. Building docs/source/user_guide/io.rst gives FileNotFoundError: [Errno 2] No such file or directory: '{}'. Am I supposed to include os.chdir(r'{}')? I am not sure which directory this code points to.

@datapythonista
Member Author

That part is a bit tricky. I think you should be able to leave it out, and everything will work as long as the Sphinx build is called from the right directory.

The chdir is not needed for calls to os module functions; its purpose is to set the path where files will be looked for. For example, if there is a pandas.read_csv('data/foo.csv') and the file is in /home/user/pandas/data/foo.csv, the {} represents the directory /home/user/pandas.

If you check the header in conf.py, you will see that there is a .format() call on the string, which is where the value is set.
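To illustrate the mechanism, here is a small sketch. The template string is a shortened, hypothetical stand-in for the real header in conf.py, and the project root is the example path from above, not a value taken from the repository.

```python
import os

# Hypothetical, shortened stand-in for the header string in conf.py.
header_template = "import os\nos.chdir(r'{}')"

# Assumed project root, reusing the example path /home/user/pandas.
project_root = "/home/user/pandas"

# .format() substitutes the directory for the {} placeholder.
header = header_template.format(project_root)
print(header)

# Copying the unformatted line into a page leaves a literal '{}' path,
# which is what triggers the FileNotFoundError reported earlier.
try:
    os.chdir(r"{}")
    chdir_error = None
except FileNotFoundError as exc:
    chdir_error = exc
    print(type(exc).__name__, exc)
```

So the {} only makes sense inside the template; once the line is pasted into an .rst page, nothing fills it in.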
