With an impressive background in consulting for government clients in emergency management, defence, and intelligence, Heidi brings a wealth of knowledge and experience to the table. Currently working as a Machine Learning Engineer at Pachama, a groundbreaking startup dedicated to leveraging technology to combat climate change, Heidi’s insights shed light on the exciting possibilities of satellite imaging.
Heidi, what motivated you to work in the field of satellite imaging?
My background is in mathematics. I was always passionate about the intersection of mathematics and geography and looking at how we can bring mathematical methods to bear on geographic questions. That naturally led to an interest in satellite imagery.
Even if we have no constructed maps or survey maps, we still have real-time (or near-real-time) information about the surface of the planet. I realised that I could take a lot of the mathematical tools I had learned and apply them to satellite imagery to extract geographic insights. So, my interest was always in applying mathematics to these geography-related questions.
What’s most interesting to you about the field of satellite imaging?
The surface of the earth is always changing. The location of cars is always changing. The buildings that people are putting up and taking down are always changing. What excites me most about this field is having the opportunity to have an unprecedented view. It’s almost as if you’re an astronaut on the International Space Station looking at the surface of the earth to understand how things are changing – and then to trace those changes all the way down to their impact at the individual level and at the societal level, and to really understand how these systems are connected.
Can you explain the differences between satellite image data and the data that we use in our daily lives?
So often, when people think about computer vision or processing imagery, they see examples of models that can detect the difference between a cat and a dog. And what we’re doing with satellite imaging is in some ways similar, but in some ways very different because the imagery itself is very different.
The type of imagery that comes from your phone has three channels, RGB: red, green, and blue. Some satellite imagery has those channels as well, but may have additional channels, too. I’d say there are three differences between regular imagery and satellite imagery. One is the resolution. The second is the band, that is, the wavelengths that are captured by the sensors. The third is the spatial information or metadata that comes with the image.
In satellite imagery, the resolution is limited by the sensor that you’re working with. With common resolutions, if you have something that’s very high resolution, each pixel might correspond to 20 centimetres on the ground, whereas with something that’s very low resolution, like the Landsat satellites, it might correspond to 15 metres. Either way, the resolution has physical meaning in the real world. The second is the spectral band.
Take traditional imagery…if you just take a picture with your phone, it will have three bands – red, green, and blue. With satellite imaging, some satellites have additional bands. So, they’ll have near infrared bands or pan (panchromatic) bands that provide additional information that can be used to detect things that humans can’t see, which again, from a data processing perspective, is a far more interesting question. We don’t just want to train algorithms to see what humans see.
And then the last difference is the spatial information and the metadata. When you take an image from a satellite, it will contain information about where on earth that is, and the altitude, the angle, the time of day. All of this provides additional metadata that can be used to build advanced models about what’s happening within that image.
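As a rough illustration of those three differences, the sketch below models a scene with a ground resolution, named bands, and acquisition metadata. The `Scene` class, its field names, and the sample values are all hypothetical. The NDVI formula, a standard vegetation index built from the red and near infrared bands, is brought in here purely as an example of what an extra band makes possible; it is not something discussed in the interview.

```python
from dataclasses import dataclass, field


@dataclass
class Scene:
    """Hypothetical container for one satellite image (illustration only)."""
    gsd_m: float   # ground distance covered by one pixel, in metres
    bands: dict    # band name -> 2D list of reflectance values
    metadata: dict = field(default_factory=dict)  # e.g. location, sun angle, time

    def footprint_m(self):
        """Return (width, height) of the scene on the ground, in metres."""
        rows = len(self.bands["red"])
        cols = len(self.bands["red"][0])
        return cols * self.gsd_m, rows * self.gsd_m

    def ndvi(self):
        """Per-pixel vegetation index: (NIR - red) / (NIR + red)."""
        out = []
        for red_row, nir_row in zip(self.bands["red"], self.bands["nir"]):
            row = []
            for red, nir in zip(red_row, nir_row):
                total = nir + red
                # Guard against division by zero on empty pixels.
                row.append((nir - red) / total if total else 0.0)
            out.append(row)
        return out


# Made-up 2x2 scene at 15 m per pixel (sample values, not real data).
scene = Scene(
    gsd_m=15.0,
    bands={
        "red": [[0.10, 0.30], [0.20, 0.25]],
        "nir": [[0.50, 0.30], [0.60, 0.25]],
    },
    metadata={"lat": 0.0, "lon": 0.0, "sun_elevation_deg": 45.0},
)

print(scene.footprint_m())  # ground extent of the image in metres
print(scene.ndvi())         # higher values suggest more vegetation signal
```

A phone photo would carry only the red, green, and blue entries; the ground resolution and the location metadata are what tie each pixel back to a physical place.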
How is the data acquired, and which types of data are you actually using for your analysis?
There are a variety of different satellites out there that have different resolutions. And in addition, there are a number of other platforms besides just satellites. So, with regards to satellites, there are commercially available sources, the likes of Planet Labs and DigitalGlobe. There are also government sources that are publicly available.
For example, Landsat data is very coarse-resolution data that’s great for land cover and vegetation and disease. And then there are also government sources that are not publicly available, and in addition to the satellite imagery sources, there are other sources from lower altitude platforms. In particular, one area of interest right now in terms of research and development is something called HAPS, which is High Altitude Platform Systems. These are systems that operate at the altitude of a weather balloon, so lower than a satellite but higher than a plane.
These are also systems that can persist in the atmosphere for a significant amount of time, on the order of hours, days, and weeks, but not years. Their advantage over a satellite is that they can be beneath some of the clouds, so you can receive different types of imagery. Imagine you have a similar sensor on a weather balloon rather than on a satellite. You’re going to get higher resolution data, and you’re also going to be able to avoid some of the atmospheric influence from clouds and other things. There’s a variety of sensors available in this space, and that’s not to mention the traditional imagery sources from aircraft or imagery sources from drones.
What limitations and challenges are there with Satellite Imaging data? How do you overcome these?
There’s certainly no scarcity of challenges in this domain. I will point out one issue that you mentioned, and that I’ve mentioned previously: the weather.
So, you can imagine there are a lot of objects of interest, particularly around object detection in the national security domain. A lot of these objects of interest aren’t found in perfectly sunny places with nice weather, and in particular, trying to find objects in snowy conditions, in winter conditions and in low light conditions presents very serious challenges. That’s true both from an object detection standpoint and from an imagery sourcing standpoint: if you have outdated imagery, it’s going to be very difficult to find things.
Another challenge that we face – and I think this is a challenge that’s quite common to a lot of people working across data science as a discipline – is data labelling.
If we’re building a detection algorithm, we need a training data set that contains appropriately labelled instances of whatever it is we’re trying to detect. Now, in some cases, for example commercial applications that count the number of cars in a parking lot, it’s not difficult to obtain and label a significant corpus of information to allow these algorithms to be successful. But with, for instance, rare classes of aircraft in the winter, it’s very difficult to obtain the base data that’s needed to train these models.
What developments are happening in your field that you are most excited about?
I think it’s a really exciting time. If we think back to the very genesis of satellite imaging, when the government first started with satellite imagery, the development has been wild. Back then they were using film cameras, it was all classified, and the film was designed to disintegrate on impact with water. The satellites would drop film canisters, and military aircraft would be responsible for retrieving those before they hit the water and disintegrated. We’ve gone from those incredibly low-resolution images that had to be pored over by cleared analysts, to CubeSats the size of a loaf of bread that can be rapidly iterated on. And so the transformation that we’ve seen just on the technology side, even just on the sensor side, has been dramatic.
There are a number of satellite imagery companies out there with really bold ambitions for imaging, working on either the hardware side or the software side of the house.
There have also been significant advancements in convolutional neural networks. It’s a very fast-moving field in terms of new network designs and new network architectures that are coming out. One advance in the field that I’m very excited about is synthetic imagery and synthetic data. As I mentioned before, if we’re trying to detect a particular class of something, say something for which real examples are scarce, you can generate seemingly plausible synthetic imagery and then train the network on it. There are a lot of challenges that come with that.
Computers are very good at finding out how other computers made something. This ultimately doesn’t always map exactly to real world data, but I think synthetic imagery for EO (earth observation), and also for SAR (synthetic aperture radar), is going to prove a very interesting field. In particular, for rare classes, difficult weather conditions, and all of these areas that we mentioned before where there are challenges in getting high-quality labelled training data.
Satellite imagery that covers the entire planet can now be captured in about 20 minutes, and hopefully soon in real time – do you think this is realistic?
I think that’s something that folks have been excited about for a long time. We’ll see how long it takes. We’ve certainly moved a lot in the last five years, but I do remember five years ago, people saying it’s right around the corner. To be honest, some satellite constellations might have been delayed during the COVID pandemic, in terms of some of the launches and the ability to get hardware ready. We’re seeing some launches for both satellites and HAPS platforms being impacted by COVID, which was unexpected.
I think it may happen. I certainly welcome it as a data source, but I wouldn’t hang my hat on it happening anytime in the next two years.
What are some of the more current applications in industry for satellite imaging?
One area of economic or commercial interest that I find quite interesting is estimating global oil reserves. My previous company holds a patent in this area and developed an algorithm to estimate the volume of oil in floating roof storage tanks.
These tanks are large cylinders where the roof of the tank moves up and down, depending on how much oil is present in the tank. This is where shadows can be an asset instead of a detriment. If you have an understanding of the sun’s position, the angle the image was taken at, the time, and the sort of metadata that’s available from the satellite, you can estimate the volume of oil, scale that up to the number of known oil fields that you can get imagery for, and start to get a really good estimate of how much oil is out there. This allows you to anticipate some of the market changes in the price of oil and the availability of oil, which is a really interesting and difficult space to work in.
There are a lot of different applications for agricultural prediction. Of course, some of them are economic. Some of them are from a planning perspective. But another interesting application of agricultural prediction is national security. When you think about root causes that can exacerbate conflict, a lack of access to resources, and in particular famine, is a circumstance that can really worsen conflict. The ability to monitor the state of agriculture on a global scale and anticipate months or even years in advance when a region might experience acute famine can help you understand the sort of geopolitical tensions this could lead to.
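Returning to the floating roof tank example, the geometry can be sketched in a few lines. This is a simplified illustration and not the patented algorithm mentioned in the interview. It assumes a near-vertical viewing angle, flat ground, and that two shadow lengths have already been measured from the image: the shadow the tank casts on the ground, and the shadow the rim casts onto the sunken roof. The function name and all parameters are hypothetical.

```python
import math


def tank_oil_volume_m3(radius_m, exterior_shadow_m, interior_shadow_m,
                       sun_elevation_deg):
    """Estimate oil volume in a floating roof tank from shadow lengths.

    Simplifying assumptions (not from the interview): nadir view,
    flat ground, and a roof that floats directly on the oil surface.
    """
    if not 0.0 < sun_elevation_deg < 90.0:
        raise ValueError("sun elevation must be between 0 and 90 degrees")

    tan_elev = math.tan(math.radians(sun_elevation_deg))

    # A vertical edge of height h casts a shadow of length h / tan(elevation),
    # so each shadow length converts back to a height.
    tank_height_m = exterior_shadow_m * tan_elev   # wall height above ground
    roof_depth_m = interior_shadow_m * tan_elev    # roof depth below the rim

    # The roof sits on the oil, so the oil column is what remains.
    oil_height_m = max(0.0, tank_height_m - roof_depth_m)
    return math.pi * radius_m ** 2 * oil_height_m


# Made-up measurements for a single tank.
volume = tank_oil_volume_m3(
    radius_m=20.0,
    exterior_shadow_m=20.0,
    interior_shadow_m=5.0,
    sun_elevation_deg=45.0,
)
print(round(volume), "cubic metres")
```

A longer interior shadow means the roof has sunk further, which means less oil. Summing estimates like this across many tanks is the scaling-up step described in the interview; a real system would also need to correct for the satellite’s viewing angle using the image metadata.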
Is there a commercial use case for satellite imaging data in assessing supply chain risk in advance?
My inclination with that is if you can combine it with other types of information, then it could prove really useful. So, for example, if you know the areas of interest for different manufacturing plants, or you know where the semiconductors tend to be staged before they’re shipped, and you can detect activity in those areas of interest, that’s one way that this could be used to determine potential supply chain issues in advance.
Another approach is to combine satellite imaging data with other information. There was a really exciting piece of work where we were tracing palm oil supply chains using voluntarily provided geolocation data as well as satellite imagery. Combining those sources can definitely give you an estimate if you understand how and where goods are flowing, because often providers don’t necessarily know the exact source of their upstream products. If you can use data to trace back to where exactly you are getting your palm oil or your raw materials from, then you might be able to detect concerns (e.g. decreased economic output or decreased availability of resources). From that you can then anticipate that this might be an issue in one month or two months.
In our next issue we talk to Heidi about how satellite imaging is helping to tackle Climate Change.