Flexible, understandable, universal – Python has a lot to recommend it both for those dipping their toes into the waters of programming and for experienced coders.
Calvium’s Managing Director Jo Reid spoke to Alan Steven, a Senior Principal Scientist at pharmaceutical company CatSci, about why he took an introductory Python course recently, and the new knowledge he will bring to his work.
Calvium recently collaborated with Alan on LabLinks, a platform that allows chemists to collaborate across company boundaries around the world. The online space can be used for sharing knowledge, networking, events, eLearning courses, innovating, or simply sounding out ideas and asking questions – like, how can I convince my team they should learn the basics of Python? Or, what’s the difference between lists and tuples?
Jo also spoke with recent MSc Computer Science graduate Lizzie Hull about why Python was an obvious choice to add to her programming portfolio, and snagged some useful tips for beginners.
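For readers wondering about that lists-versus-tuples question, here is a minimal sketch of the difference. The variable names and values are purely illustrative and do not come from the interview.

```python
# A list is mutable: it can grow, shrink, and have items replaced.
solvents = ["water", "ethanol"]
solvents.append("acetone")
solvents[0] = "methanol"

# A tuple is immutable: once created, its contents are fixed.
reading = (25.0, "celsius")

try:
    reading[0] = 30.0  # attempting to change a tuple raises TypeError
except TypeError as error:
    tuple_error = str(error)

# Tuples of hashable values can be used as dictionary keys; lists cannot.
# The value below is an approximate, illustrative figure.
boiling_points = {("ethanol", "celsius"): 78.4}

print(solvents)
print(tuple_error)
```

In short, a list suits a collection that will change, while a tuple suits a fixed record such as a value paired with its unit.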
Jo: Alan, in your career I’ve noticed there’s a recurring theme about you wanting to innovate and shake up processes. Was that part of your motivation for learning to program in Python?
Alan: You are never too old to learn something new. I’m conscious that to keep up with the times, to interact with anyone new coming into the company and to speak the language of the problems we’re facing, it’s necessary to learn new skills. You need to learn enough to engage with stakeholders and solve the problem you are working on. You don’t need to be an expert.
An underlying reason was the move to automation in the industry as a whole with the emergence of Pharma 4.0. Computers and machines are not afraid of repetitive tasks, and they can do things again and again to a consistent standard to produce a product – material or data – of a certain quality. This frees manpower up for innovative thinking.
Related to that is a regulatory trend regarding the data package required when a medicine is being assessed by a regulator. The expectations around this are becoming increasingly stringent and data-driven as the innovator tries to demonstrate understanding of their manufacturing process.
Finally, there is the ability to use a language like Python on single-board computers, which increasingly have applications for different sorts of people. I can imagine using Raspberry Pi computers programmed with Python in our organisation to gather a lot of the data that a regulator might ask for. It would allow you to build a lot of homemade sensor kits that would cost tens of thousands of pounds if they were bought commercially.
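As a rough illustration of the kind of data gathering Alan describes, the sketch below logs timestamped readings to CSV. It is an assumption-laden example, not CatSci's setup: `read_temperature` is a hypothetical stand-in that simulates a value, and on a real Raspberry Pi it would be replaced by whatever driver the attached sensor needs.

```python
import csv
import io
import random
from datetime import datetime, timezone


def read_temperature():
    """Hypothetical stand-in for a real sensor driver.

    Returns a simulated temperature in degrees Celsius.
    """
    return round(random.uniform(20.0, 25.0), 2)


def log_readings(output, read_sensor, count):
    """Write `count` timestamped readings from `read_sensor` to `output` as CSV."""
    writer = csv.writer(output)
    writer.writerow(["timestamp_utc", "temperature_c"])
    for _ in range(count):
        timestamp = datetime.now(timezone.utc).isoformat()
        writer.writerow([timestamp, read_sensor()])


if __name__ == "__main__":
    # Log to an in-memory buffer here; a real kit would write to a file.
    buffer = io.StringIO()
    log_readings(buffer, read_temperature, count=3)
    print(buffer.getvalue())
```

Passing the sensor function in as an argument keeps the logging code unchanged when the simulated reading is swapped for real hardware.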
Jo: What was it that drew you to learning Python in particular?
Alan: Python is described as being not the best at anything but second best at everything. It’s flexible and because it’s based on English, it’s easier to understand than other languages. I’ve heard it described as the ‘new language of science’. Instead of being multilingual in foreign languages, it will be programming languages that give scientists an edge. Python’s flexibility and its use across a number of different industries is attractive.
Jo: Was there a particular goal you wanted to achieve by learning Python?
Alan: I wanted to know what I didn’t know. Rather than imploring other people to start on their own programming journeys, I wanted to be a role model and show how it’s done, and show that having a busy schedule isn’t an impediment to fitting in a new skill. From the course I did recently, it’s still not possible for me to do anything really useful in a professional sense, but it has whetted my appetite. I think what I will do now is read some of the primer books and try to use it for some day-to-day activities, particularly office-based admin, rather than using it directly on my projects at work.
Jo: And Lizzie, what made you learn Python during your Master’s at Bristol University?
Lizzie: Employability was definitely a factor in wanting to learn Python, but the main reason I chose it was that it has many well-established and highly regarded data science and machine learning libraries I could use for my final project. It also gave me access to a large community of data science practitioners whose expertise I was able to draw on.
Jo: Alan, would you recommend other people in your business take up Python now you have more of an understanding of how it might be integrated into their work?
Alan: I think everyone should have some appreciation of it. I’ve become aware that although it’s easy to pick up quite a lot quite quickly, you need to spend a concentrated period of time to get to an expert level, so there’s an inflection in the journey where you plateau and have to put in a lot of effort to increase your skill again.
It’s opened my eyes to what the rest of the journey could look like and made it easier for me to think about what it could do and how. If we’re trying to recruit as a business, we might recruit chemists who also know a bit about data science or data scientists who don’t know any chemistry but are able to go a lot deeper into data science.
Jo: Working in a multi-disciplinary team, you need to strike a balance between deep expertise and then more of a strategic view of how it fits together. Do you have tips for strategists about how in-depth they should get into data science?
Alan: I think it’s different for different people. I’ve always found that the people who can dive deep into another subject have a much better outlook on some problems because they’ve worked in two different fields and their brains have developed slightly differently. They know that what is simple for one group of people is actually quite difficult for another group of people. Having that sort of experience of two different disciplines can only really be an advantage.
Jo: Is there an advantage in transferring scientific thinking and approaches to learning programming?
Alan: Not so much for organic chemistry. Although chemistry does have a quantitative background, at the level that we use it on a day-to-day basis there are rules of thumb and ways of recognising patterns that allow you to solve the problems without really needing to have a lot of formal training in maths. There are some other areas of chemistry – like computational and physical chemistry – which are a lot more quantitative and where that crossover is a lot more natural.
Lizzie: I found my previous academic background in maths was very helpful for learning programming languages in general, and for the Python data science and machine learning libraries I used in particular. The undergraduate course I took in linear algebra was especially useful for understanding the theory behind machine learning, but university-level maths is certainly not a prerequisite to learning Python.
Jo: I’m also curious about the role of design. For example, experimental design – knowing what the hypotheses are and how you structure a problem in a chemistry sense, and then in a programming sense.
Alan: From the chemistry side of things, you put pure data into a means of processing something and you’ll get a pure answer out. It’s always about trying to have the right hypothesis and the right problem that you’re trying to solve and having the right approach.
We talk a lot about ‘starting with the end in mind’ and having a clear view of what the end state looks like, which is often having an effective medicine that works for the patient. When you’re trying to work through things, you always need to think ‘will this deliver something that is aligned to the end state or am I going off completely in the wrong direction?’ That will be the same for all sorts of problem-solving across different disciplines.
Jo: At the next layer up, something that’s very important for wider programming skills is user interface design and usability. Were you looking to learn Python purely for working with data, or would it also help to be able to visualise and present data better and think about process flows?
Alan: Data visualisation is certainly important, and we talk a lot about how we interact with data. We’re trying to get away from having data in flat PDFs that can’t really be interpreted. I don’t really have any interest in the GUI or website design side of things or coding the algorithms into the product.
The language that is often talked about for doing data interpretation in our field is R as opposed to Python. R seems to be a bit more specialised. Being able to take large datasets and visualise them in different ways with data analytics packages is increasingly important.
Jo: Any tips for other people who might be interested in learning a programming language?
Alan: One book that a number of people have recommended is Automate The Boring Stuff With Python. It seems to be useful to help with a lot of office-based activities. It won’t necessarily be able to give you a lot of the basic understanding, but it will allow you to automate a lot of common tasks.
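To give a flavour of the office-based automation Alan mentions, here is a small sketch that adds a prefix to every matching file in a folder. It is not taken from the book; the function name `add_prefix`, the folder name and the prefix are all hypothetical choices for illustration.

```python
from pathlib import Path


def add_prefix(folder, prefix, pattern="*.pdf"):
    """Rename files in `folder` matching `pattern` so each name starts with `prefix`.

    Returns the list of new file names. Files that already carry the
    prefix are skipped, so running the function twice is harmless.
    """
    renamed = []
    # sorted() collects the matches up front, before any file is renamed.
    for path in sorted(Path(folder).glob(pattern)):
        if path.name.startswith(prefix):
            continue
        target = path.with_name(prefix + path.name)
        path.rename(target)
        renamed.append(target.name)
    return renamed


if __name__ == "__main__":
    # "reports" is a hypothetical folder; the loop does nothing if it is absent.
    for name in add_prefix("reports", "2024_"):
        print("renamed to", name)
```

A task like this takes minutes by hand for a few files, but the same few lines handle hundreds without mistakes.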
I took a course, which was useful because there was homework that was marked and that stimulates you to try and do it. There was peer pressure and the materials that were provided were very good. All of the sessions were recorded, the answers to the problems worked through in the sessions were provided afterwards and there were bonus sessions that were discipline-specific like biochemistry or finance.
Lizzie: Since I already knew other programming languages, I used the SoloLearn app to learn the basics of the syntax, which I found helpful. More generally, I would recommend searching for ‘Python resources masterlist’ or something similar. There are tonnes of resources of varying quality out there, so it’s useful to find a list that has been curated by an expert. Grokking Algorithms is a fun book that helped me learn some of the fundamental algorithms used in programming, and it includes code examples in Python. I watched lots of video tutorials while learning to code neural networks in PyTorch.
I’ve also used GitHub for both individual and group work. It takes a little while to get the basics down, but I’d definitely recommend learning how to use version control once you start doing more complicated projects. Stack Overflow is invaluable for looking up or asking any programming questions. I also found it useful to read relevant sections of the PyTorch API documentation and search for any terms or concepts that I didn’t understand, and you can do the same with any library or the official Python documentation.
Alan: I’ve heard about GitHub but I haven’t used it myself. Coming back to the hiring question, if somebody has put their own code on GitHub and they are putting it out there for criticism, it’s showing that they have something behind what they’re saying in terms of qualifications rather than saying they know a bit of Python but have no evidence of its use in the real world. If you’ve created some code on GitHub, that’s a good indicator that you’re someone who knows what you’re talking about.
Please visit LabLinks.io to join in the conversation about the use of Python for scientific innovation in the Digital Curiosity group.