Posted on :: Tags:

Sooner or later, most scientist have to automate certain tasks. It is generally advisable to leave such repetitive things that can be automated to the computer. Here we have a look at tools that might help with this task.

First things first: this is by no means going to be a comprehensive article, but will rather be my own, fairly biased view, which heavily focuses on python and git. It hopefully starts simple, and then gets more advanced fast.

Python

Python is a scripted computer language that allows you to do many scientific tasks, e.g., data evaluation, automatically. It is a fairly simple language to learn, and you can get going fast, however, be open to learn more advanced techniques later on. This means: Never stop learning! Python has many advanced capabilities that can make your life easier and faster.

Learning Python

Software Carpentry has two great lessons on python, which are especially suited for beginners. These lessons should give you an introduction and overview of the language, but also teach you how to plot figures using matplotlib.

A great book to introduce you python was written by Allen Downy and can be found here. It is available for free. The book gives an in-depth introduction into python, but also into the basics behind the language. Such knowledge is always helpful later on, since it gives you a deeper understanding of certain behavior.

Working with Python

The official python distribution is distributed from python.org. If you install this distribution, you can add further packages using pip. However, this is not always the most straight forward way to work with python.

Virtual environments

Virtual environments should be considered for all your projects. Depending on what python environment you are using, these environments will be created different. It is worth looking into it.

Anaconda

For data processing and scientific python, check out Anaconda. This gives you a conda environment for python. Furthermore, Anaconda comes with many packages pre-installed and has a graphical manager to handle your distribution. This can be advantageous, depending on what you need.

pyenv

If you need to handle multiple different python versions and want to easily manage virtual environments, check out pyenv.

Jupyter

Finally, Jupyter Notebooks give a straight forward interface to python that allow you to play and develop code in a browser, as well as to document it in Markdown. Googles flavor of Jupyter is called Google Collab and run on the web. JetBrains, the creator of PyCharm, also has an online Jupyter Notebook Server with is especially great for developing at the same time. It is called Datalore.

For Astrophysics, especially if you are interested in NuGrid data, check out the Astrohub. You can log into the public outreach server with your GitHub account and then use a JupyterLab environment to run your astrophysics models.

Editors

Many good python editors exist. Personally, I prefer PyCharm, however, many other options. PyCharm is a fully integrated developer environment (IDE) and comes with many more tools than you need in the beginning, it can therefore be overwhelming at first.

Other notable editors are Spyder, which is a full IDE that comes pre-installed with Anaconda. Also notable is Sublime Text, which works in my workflow especially well for scripting. Here is a great article on how to set up Sublime Text for python.

GUIs

If you are interested in creating graphical user interfaces for your programs, it is worth looking into GUIs. Two potential GUI creation packages that you might want to consider using are PyQt or PySide. Great tutorials on Qt can be found here.

If you want to package your GUIs with installers, check out fbs. Note that the open / free version only supports python-3.6 and is restricted to PyQt5. If you want to dabble with the pro version, let me know.

Advanced Python

Auto-formatting

Formatting python code should adhere - for readability - to certain rules. These are often also referred as linting requirements. While it is tedious to format code by hand, automatic formatters are very helpful. I generally use black to format my python code. The beauty of this is that it there are not many possibilities to format your code, therefore, most of the decisions are already made, and it always looks awesome. Various plugins exist that can be used in editors and IDEs. Search engines are useful to find them.

Test your code!

Testing of code is crucial, since you generally want to make sure that your scripts, functions, classes, etc., do what you want them to do. An amazing package to test your python code is pytest. If you are interested in learning testing with python, check out Brian Okken's book here.

Git

When working with code, version control should be an integral part of your workflow. One way of controlling versions is with git. The Galactic Forensics Laboratory has its git repositories hosted here on GitHub. Lab members get access to all repositories of the lab.

Learning git

The best way to learn git and / or to review your skill is by going through this course on Software Carpentry. The next step is then to use git and, if you want to keep your repos online, a service such as GitHub. You can also browse the git book, which is available for free here.

The beauty of git is that you can most of the time go back in time if you made a mistake. So don't worry if something happens! A good resource for these weird cases is Dangit Git.

Good practices

If you want to contribute code to a repository of which you are not a maintainer, you should fork the repository to your own GitHub account. Then create a branch with an appropriate name for the feature you want to contribute. Add your changes, push your branch to your fork, and then create a pull request where you describe what you have changed and why. Keep it short, but descriptive.

Most projects, e.g., the iniabu project have a developers guide that gives you additional information on how to contribute. For iniabu, you can find the guide, e.g., here in the docs.

Advanced git and GitHub

pre-commit

If you are using git regularly, especially on public projects, pre-commit can help you to automate tasks. You can install so-called hooks that help you perform various tasks, e.g., formatting, etc. Check out the pre-commit website.

GitHub Actions

For automatic testing on GitHub, consider using GitHub Actions. These actions can especially help when using continuous integration (CI).

Some more advanced resources

Code coverage

When testing your python code, it is useful to know how many lines of your code are actually tested by your test suite. To automate this process, you can, e.g., use GitHub hooks for coveralls.

Documentation

Last, but surely not least, you will likely make extensive use of great documentations that you can find online. For python code, automatic documentation using your doc strings can be really helpful. One tool to do so is sphinx. This is especially powerful in combination with ReadTheDocs, which can also be implemented with a GitHub hook.

Hypermodern Python

Finally, if you want to code in python using many bells and whistles, check out the blog articles on hypermodern python by Claudio Jolowicz. The series can be found here, the first article here.