Library of Congress Cataloging‐in‐Publication Data
Name: Esmaili, Rebekah Bradley, author.
Title: Earth observation using Python : a practical programming guide / Rebekah B. Esmaili.
Description: Hoboken, NJ : Wiley, [2021] | Includes bibliographical references and index.
Identifiers: LCCN 2021001631 (print) | LCCN 2021001632 (ebook) | ISBN 9781119606888 (hardback) | ISBN 9781119606895 (adobe pdf) | ISBN 9781119606918 (epub)
Subjects: LCSH: Earth sciences--Data processing. | Remote sensing--Data processing. | Python (Computer program language) | Information visualization. | Artificial satellites in earth sciences. | Earth sciences--Methodology.
Classification: LCC QE48.8 .E85 2021 (print) | LCC QE48.8 (ebook) | DDC 550.285/5133--dc23
LC record available at https://lccn.loc.gov/2021001631
LC ebook record available at https://lccn.loc.gov/2021001632
Cover Design: Wiley
Cover Image: © NASA
FOREWORD
When I first met the author a few years ago, she was eager to become more involved in the Joint Polar Satellite System’s Proving Ground. The Proving Ground by definition assesses the impact of a product in the user’s environment; this intrigued Rebekah because, as a product developer, she wanted to understand the user’s perspective. Rebekah worked with the National Weather Service (NWS) to demonstrate how satellite‐derived atmospheric temperature and water vapor soundings can be used to describe the atmosphere’s instability to support severe weather warnings. Rebekah spent considerable time with users at the Storm Prediction Center in Norman, Oklahoma, to understand their needs, and she discovered their thirst for data and their need for that data to be easily visualized and understood. This is where Rebekah leveraged her expert skills in Python to provide NWS with the information they found to be most useful. Little did I know at the time she was writing a book.
As noted in this book, a myriad of Earth‐observing satellites collect critical information about the Earth’s complex and ever‐changing environment and landscape. Unfortunately, not all of that information is being used effectively today, for various reasons: issues with data access, different data formats, and the need for better tools for data fusion and visualization. If we were able to solve these problems, there would be vast improvements in providing societies with the information needed to support decisions related to weather and climate and their impacts, including high‐impact weather events, droughts, flooding, wildfires, ocean/coastal ecosystems, air quality, and more. Python is becoming the universal language to bridge these various data sources and translate them into useful information. Its free and open nature, and the data‐ and code‐sharing mindset of the Python communities, make Python very appealing.
Being involved in a number of international collaborations to improve the integration of Earth observations, I can certainly emphasize the importance of working together, data sharing, and demonstrating the value of data fusion. I am very honored to write this Foreword, since this book focuses on these issues and provides an excellent guide with relevant examples for the reader to follow and relate to.
Dr. Mitch Goldberg
Chief Program Scientist
NOAA-National Environmental Satellite, Data, and Information Service
June 22, 2020
ACKNOWLEDGMENTS
This book evolved from a series of Python workshops that I developed with the help of Eviatar Bach and Kriti Bhargava from the Department of Atmospheric and Oceanic Science at the University of Maryland. I am very grateful for their assistance providing feedback for the examples in this book and for leading several of these workshops with me.
This book would not exist without their support and contributions from others, including:
The many reviewers who took the time to read versions of this book, several of whom I have never met in person. Thanks to modern communication systems, I was able to draw from their expertise. Their constructive feedback and insights not only helped to improve the quality and breadth of this book but also helped me hone my technical writing skills.
Rituparna Bose, Jenny Lunn, Layla Harden, and the rest of the team at AGU and Wiley for keeping me informed, organized, and on track throughout this process. They were truly a pleasure to work with.
Nadia Smith and Chris Barnet, and my other colleagues at Science and Technology Corp., who provided both feedback and conversations that helped shape some of the ideas and content in this book.
Catherine Thomas, Clare Flynn, Erin Lynch, and Amy Ho for their endless encouragement and support.
Tracie and Farid Esmaili, my parents, who encouraged me to aim high even if they were initially confused when their atmospheric scientist daughter became interested in “snakes.”
INTRODUCTION
Python is a programming language that is rapidly growing in popularity. The number of users is large, although difficult to quantify; in fact, Python is currently the most tagged language on stackoverflow.com, a coding Q&A website with approximately 3 million questions a year. Some view this interest as hype, but there are many reasons to join the movement. Scientists are embracing Python because it is free, open source, easy to learn, and has thousands of add‐on packages. Many routine tasks in the Earth sciences have already been coded and stored in off‐the‐shelf Python libraries. Users can download these libraries and apply them to their research rather than simply using older, more primitive functions. The widespread adoption of Python means scientists are moving toward a common programming language and set of tools that will improve code shareability and research reproducibility.
Among the wealth of remote sensing data available, satellite datasets are particularly voluminous and tend to be stored in a variety of binary formats. Some datasets conform to a “standard” structure, such as netCDF4. However, because of uncoordinated efforts across different agencies and countries, such standard formats bear their own inconsistencies in how data are handled and intended to be displayed. To address this, many agencies and companies have developed numerous “quick look” methods. For instance, data can be searched for and viewed online as JPEG images, or individual files can be displayed with free, open‐source software tools like Panoply (www.giss.nasa.gov/tools/panoply/) and HDFView (www.hdfgroup.org/downloads/hdfview/).
Still, scientists who wish to execute more sophisticated visualization techniques will have to learn to code. Coding knowledge is not the only limitation for users. Not all data are “analysis ready,” i.e., in the proper input format for