---
layout: post
title: "Goals for this blog"
categories:
  - medicine
tags:
  - machine learning
  - programming
  - education
---

## Why should physicians learn to code?

- Not every physician needs to learn to code. We already have too much to learn and retain. Administrative work eats up our time outside of (or, too often, while) seeing patients. Even “protected” research time is consumed by managing the lab, directing resources, and catching up on yesterday's patient notes, while actual scientific writing happens in time carved out of sleep, late at night or early in the morning. And what about our loved ones and the occasional fit of self-care? In other words, we are physician-scientists too, and we get it. It's hard. However, our goal in this book is to show how a little knowledge of basic programming principles, combined with practical examples you can copy, paste, and modify for your own use, can save time, enable analyses that were previously unthinkable, improve communication, help you grasp (and critique) the latest research, and, ultimately, do what we all came here to do: help people, using all of the scientific tools and human compassion we can muster.
- Another argument against learning to code is the proliferation of point-and-click interfaces for machine learning, as well as companies that sell their programming services for modest fees. If other people have done or will do the coding for you, why learn? We have nothing against this approach, and hope that machine learning tools will continue to become more accessible while specialist companies become profitable. There is room for everyone. Still, there are three arguments for learning to do it yourself: you can remain free, open-source, and up-to-date.
- Free: Python and R are free programming languages. The algorithms you need for the vast majority of machine learning applications are prepackaged in free libraries. Proprietary and expensive software may offer niche advantages in particular content areas, but this is less and less true as people publish equivalent yet free solutions.
Companies and specialists that do machine learning for you are great, and can help with your most complex problems, but most of the time, frankly, you will have a bread-and-butter dataset you can analyze with simple code on your own computer. If you do it yourself, you won't have to deal with financial, legal, IRB, and HIPAA hurdles beyond those you cleared to get the project going in the first place. You also have irreplaceable content expertise: you know your patients, the medicine, what questions to ask, and what bogus results look like. Maybe your machine learning idea won't work, but you won't know that until you've run a quick pilot. When you do work with a specialist, how much more powerful and streamlined will your conversation be if you have pilot data, or at least specific problems, to discuss? With a modest computer (most of our analyses are done on a low-powered hospital-supplied tablet PC) and a few free downloads, you can start today, no poorer and no more frustrated with paperwork than you already are. This also means that researchers around the world, regardless of financial circumstance, can make an impact. Lastly, if you are an educator, there are no financial barriers to introducing your students to these concepts and tools. They're free.
- Open-source 1: Python and R are open-source. Not only do you not have to pay for them, but you can crack them open and interrogate them all the way down to the 1s and 0s. Nothing is proprietary or hidden. You can modify them to suit your needs, or even publish your own custom version. This is also true for the majority of machine learning packages. Scientific reproducibility is at its maximum when the tools used for analyses are completely transparent. It is becoming common for researchers to publish not only their (HIPAA-compliant) raw data, but also the actual code. “Go ahead, run it and see if you get what we got.” If you write the code yourself, you can easily share it, receive critique, and improve it.
This is good science.
- Open-source 2: When you see the words “open-source,” you should immediately think “community.” Researchers publish code not only to officially claim ideas, but to engage with communities. Online repositories such as GitHub make it easy to ask questions, suggest improvements, and give examples. We continue to be impressed by how willing researchers are to participate in community, and have had many productive conversations with authors and fellow users, including requests for features (that were then implemented), questions about problems, and suggestions about uses. People are excited for you to use their software, and want to help you take it in new directions. There are other programming-specific communities such as Stack Exchange, where programmers in every field present their coding problems, some quite granular, and others give suggestions that are dynamically peer-reviewed and preserved online. The best answers have code you can copy and paste, and there are typically several approaches offered for the same problem. One of the first steps to becoming a good programmer is learning how to Google your problem, find an answer (usually on GitHub or Stack Exchange), and modify the code you copied and pasted. You almost never have to reinvent any wheels. You just change out hubcaps and bolts here and there. This is possible because of the robust and generous communities that have built up around open-source software and ideals.
- Up-to-date: The state of the art is online. When machine learning scientists publish new algorithms, they often publish an R or Python package to go along with the paper, so you can try it out right away. It takes time for these algorithms to make it into point-and-click interfaces, and some algorithms may never be available as an option in a prebuilt system. Even when they do show up, the interface may not let you make the tweaks and adjustments needed to fit your use.
(Imagine if you could only prescribe combination pills, and had no way to adjust the dosage of each component!) A little programming experience gives you the power and flexibility to use algorithms, new and old, in any combination you see fit. You will find that, compared to point-and-click interfaces, most tasks are easier and faster once you have the code for them, and you can always update that code to suit your needs and preferences.
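
To give a feel for how little code a bread-and-butter pilot analysis can take, here is a minimal sketch that uses only Python's free standard library. Everything in it is an assumption made for illustration: the column names, the group labels, the helper name `summarize_by_group`, and the numbers themselves, which are invented and are not patient data. In practice you would read your own file instead of the inline text.

```python
import csv
import io
import statistics

# Invented, illustrative values only; this is NOT real patient data.
RAW_CSV = """patient_id,group,systolic_bp
1,treatment,128
2,treatment,122
3,treatment,131
4,treatment,119
5,control,138
6,control,141
7,control,135
8,control,144
"""


def summarize_by_group(csv_text):
    """Return {group: (n, mean, sample standard deviation)} for systolic_bp."""
    values = {}
    # DictReader turns each row into a dict keyed by the header names.
    for row in csv.DictReader(io.StringIO(csv_text)):
        values.setdefault(row["group"], []).append(float(row["systolic_bp"]))
    return {
        group: (len(v), statistics.mean(v), statistics.stdev(v))
        for group, v in values.items()
    }


summary = summarize_by_group(RAW_CSV)
for group, (n, mean, sd) in summary.items():
    print(f"{group}: n={n}, mean={mean:.1f}, sd={sd:.1f}")
```

Swapping the inline text for a file on your own computer, or changing which column gets summarized, is exactly the kind of hubcap-and-bolt modification described above.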