Sooner or later, most scientists have to automate certain tasks. It is generally advisable to leave repetitive tasks that can be automated to the computer. Here we have a look at tools that might help with this.
First things first: this is by no means going to be a comprehensive article,
but will rather be my own, fairly biased view, which heavily focuses on
python and git.
It hopefully starts simple, and then gets more advanced fast.
Python
Python is a scripting language that allows you to automate many scientific tasks, e.g., data evaluation. It is a fairly simple language to learn, and you can get going fast. However, be open to learning more advanced techniques later on: Never stop learning! Python has many advanced capabilities that can make your life easier and your work faster.
Learning Python
Software Carpentry has two great
lessons on python, which are especially suited for beginners. These lessons
should give you an introduction and overview of the language, but also teach
you how to plot figures using matplotlib.
A great book to introduce you to python was written by Allen Downey and can
be found here. It is available
for free. The book gives an in-depth introduction to python, but also to
the basics behind the language. Such knowledge is always helpful later on,
since it gives you a deeper understanding of certain behavior.
Working with Python
The official python distribution is distributed from
python.org.
If you install this distribution, you can add further packages using
pip.
However, this is not always the most straightforward way to work with python.
Virtual environments
Virtual environments should be considered for all your projects. Depending
on which python distribution you are using, these environments are
created differently. It is worth looking into the options.
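As a minimal sketch of one way to create such an environment, the snippet below uses the `venv` module from python's standard library. The helper name `create_environment` and the folder name `.venv` are my own choices for illustration. Other tools, e.g., conda or pyenv, create their environments differently.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(project_dir, name=".venv"):
    """Create a virtual environment inside project_dir and return its path.

    The helper and the default folder name are illustrative assumptions.
    """
    env_dir = Path(project_dir) / name
    # with_pip=False keeps this sketch fast and offline. Set with_pip=True
    # if you want pip to be available inside the new environment.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    # Try it out in a throw-away directory.
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = create_environment(tmp)
        print("Environment created:", (env_dir / "pyvenv.cfg").is_file())
```

On the command line, `python -m venv .venv` does the same job. Either way, you still have to activate the environment in your shell before installing packages into it.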
Anaconda
For data processing and scientific python, check out
Anaconda. This gives you a
conda environment for python. Furthermore, Anaconda comes with many
packages pre-installed and has a graphical manager to handle your distribution.
This can be advantageous, depending on what you need.
pyenv
If you need to handle multiple different python versions and want
to easily manage virtual environments, check out
pyenv.
Jupyter
Finally, Jupyter Notebooks give a straightforward interface
to python that allows you to play with and develop code in a browser, as well as
to document it in Markdown.
Google's flavor of Jupyter is called
Google Colab and runs on the web.
JetBrains, the creator of PyCharm, also has an online Jupyter Notebook
server, which is especially great for developing collaboratively at the same time. It is
called Datalore.
For Astrophysics, especially if you are interested in NuGrid data, check out the Astrohub. You can log into the public outreach server with your GitHub account and then use a JupyterLab environment to run your astrophysics models.
Editors
Many good python editors exist. Personally, I prefer PyCharm; however, many other options exist. PyCharm is a fully integrated development environment (IDE) and comes with many more tools than you need in the beginning. It can therefore be overwhelming at first.
Another notable editor is Spyder, which is a full IDE that comes pre-installed with Anaconda. Also notable is Sublime Text, which works especially well for scripting in my workflow. Here is a great article on how to set up Sublime Text for python.
GUIs
If you are interested in creating graphical user interfaces (GUIs) for your
programs, two GUI creation packages that you
might want to consider using are
PyQt or
PySide.
Great tutorials on Qt can be found here.
If you want to package your GUIs with installers,
check out fbs. Note that the open / free
version only supports python-3.6 and is restricted to PyQt5. If you want
to dabble with the pro version, let me know.
Advanced Python
Auto-formatting
For readability, python code should be formatted according to certain rules.
These are often also referred to as linting requirements. Since it is tedious to
format code by hand, automatic formatters are very helpful. I generally use
black to format my python code.
The beauty of this is that there are not many possibilities to format your
code; most of the decisions are therefore already made, and it always looks
awesome. Various plugins exist that can be used in editors and IDEs.
Search engines are useful to find them.
Test your code!
Testing of code is crucial, since you generally want to make sure that your
scripts, functions, classes, etc., do what you want them to do.
An amazing package to test your python code is
pytest. If you are interested in
learning testing with python, check out Brian Okken's book
here.
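To give an idea of what such a test can look like, here is a minimal sketch. The function `mean` and the file name `test_mean.py` are made up for illustration. pytest collects functions whose names start with `test_` and reports every failing `assert`.

```python
# Contents of a hypothetical file called test_mean.py


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("mean() needs at least one value")
    return sum(values) / len(values)


def test_mean_of_integers():
    # A plain assert is all pytest needs.
    assert mean([1, 2, 3]) == 2


def test_mean_rejects_empty_input():
    # Written without importing pytest to keep the sketch self-contained.
    # With pytest installed, `with pytest.raises(ValueError):` is the more
    # idiomatic way to write this check.
    try:
        mean([])
    except ValueError:
        pass
    else:
        raise AssertionError("expected a ValueError for empty input")
```

Running `pytest` in the folder that contains this file should then execute both tests. In a real project, the function would live in its own module and the test file would import it.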
Git
When working with code, version control should be an integral part of your
workflow. One way of controlling versions is with git.
The Galactic Forensics Laboratory has its git repositories hosted
here on GitHub. Lab members get
access to all repositories of the lab.
Learning git
The best way to learn git and / or to review your skills is by going through
this course on Software Carpentry.
The next step is then to use git and, if you want to keep your repos online,
a service such as GitHub. You can also browse the git
book, which is available for free here.
The beauty of git is that, most of the time, you can go back in time if you
made a mistake. So don't worry if something happens! A good resource for
these weird cases is Dangit Git.
Good practices
If you want to contribute code to a repository of which you are not a maintainer, you should fork the repository to your own GitHub account. Then create a branch with an appropriate name for the feature you want to contribute. Add your changes, push your branch to your fork, and then create a pull request where you describe what you have changed and why. Keep it short, but descriptive.
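The workflow above can be sketched as a short python script. This is only an illustration under a few assumptions: you have already forked and cloned the repository, the remote called `origin` points to your fork, and the helper names `contribution_commands` and `run` are my own. By default the script only prints the git commands, so you can read them before running anything.

```python
import shlex
import subprocess


def contribution_commands(branch, message, remote="origin"):
    """Return the git commands for a simple feature-branch contribution.

    Assumes that `remote` points to your own fork of the repository.
    """
    return [
        # Create the feature branch and switch to it.
        ["git", "checkout", "-b", branch],
        # Stage all your changes.
        ["git", "add", "--all"],
        # Commit them with a short, descriptive message.
        ["git", "commit", "-m", message],
        # Push the branch to your fork.
        ["git", "push", "--set-upstream", remote, branch],
    ]


def run(commands, dry_run=True):
    """Print each command and execute it only if dry_run is False."""
    for command in commands:
        print(shlex.join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run(contribution_commands("fix-docs-typo", "Fix typo in the docs"))
```

The pull request itself is then created on GitHub, from the branch you just pushed.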
Many projects, e.g., the iniabu
project, have a developer's guide that gives you additional information on
how to contribute. For iniabu, you can find the guide
here in the docs.
Advanced git and GitHub
pre-commit
If you are using git regularly, especially on public projects, pre-commit
can help you to automate tasks. You can install so-called hooks that help
you perform various tasks, e.g., formatting, etc. Check out the
pre-commit website.
GitHub Actions
For automatic testing on GitHub, consider using GitHub Actions. These actions can especially help when using continuous integration (CI).
Some more advanced resources
Code coverage
When testing your python code, it is useful to know how many lines of your
code are actually tested by your test suite. To automate this process,
you can, e.g., use GitHub hooks for coveralls.
Documentation
Last, but surely not least, you will likely make extensive use of the great
documentation that you can find online. For python code, automatic
documentation using your docstrings can be really helpful. One tool to do so
is sphinx.
This is especially powerful in combination with
ReadTheDocs, which can also be implemented with
a GitHub hook.
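As a rough sketch of what a docstring for automatic documentation can look like, the function below uses the reStructuredText field-list style that sphinx understands. The function itself is a made-up example. Other docstring styles exist as well, so check what your project uses.

```python
import math


def half_life_to_decay_constant(half_life):
    """Convert a half-life into a decay constant.

    :param half_life: Half-life in any time unit, must be positive.
    :type half_life: float
    :return: Decay constant in the inverse of that time unit.
    :rtype: float
    :raises ValueError: If ``half_life`` is not positive.
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return math.log(2) / half_life
```

Assuming the `sphinx.ext.autodoc` extension is enabled in your sphinx configuration, an `autofunction` directive pointing at this function pulls the docstring into the rendered documentation.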
Hypermodern Python
Finally, if you want to code in python using many bells and whistles,
check out the blog articles on hypermodern python by Claudio Jolowicz.
The series can be found here, the
first article
here.