
Using Deep learning to identify natural disasters from satellite images

Part 1: Using Deep learning to identify natural disasters from satellite images

The idea that led to this mini-project was a question: how effective would a CNN be at identifying natural disasters when fed satellite images? An accurate system capable of tagging the type of disaster from close-to-real-time satellite images would lead to better response and relief for the impacted areas.

Data used: RGB satellite images of various resolutions, across multiple disaster types, from the NASA Earth Observatory were used as the training and validation samples (see Image 1 below).

Image 1: A 2967 × 2967 RGB satellite image of a wildfire (Source: NASA Earth Observatory)

Data extraction: To extract the images from the NASA Earth Observatory I used the really cool EONET API (https://eonet.sci.gsfc.nasa.gov/api/v2.1/events). The API lets you pull the required image URLs with GET requests, and the events are tagged by natural disaster type (fire, drought, storms, etc.). I used the requests package in Python to extract the required URLs and their corresponding labels, and a mix of the urllib, lxml.html and requests packages to download the images.

# The code below downloads images to your local disk from a CSV of links
import csv
import requests

with open('/required_url.csv') as csvfile:
    csvrows = csv.reader(csvfile, delimiter=',', quotechar='"')
    for row in csvrows:
        if 'view.php' in row[0]:
            url, filename, locn = row[0], row[1], row[2]
            print(locn)
            # Fetch the image and save it under the filename from the CSV
            response = requests.get(url)
            with open(filename, 'wb') as img_file:
                img_file.write(response.content)

One of the important points I wanted to study was the impact a deep learning model can make when data is scarce. To make sure my data sample was not image-rich, I used only 80 images. To keep the task simple I decided against multi-class classification and limited the labels to two classes (Storm_wildfires / others). The end goal for the model was to identify whether an image shows either a storm or a wildfire (every other natural disaster should be classified as others).

The machine learning / data science / deep learning community comprises some of the most amazingly smart individuals, who are developing cutting-edge tools and algorithms that will change the world as we know it. They are also kind enough to share these with the rest of us so that we don't need to reinvent the wheel. I used transfer learning facilitated by the fastai library developed by the team @fastai. Most of the methodologies used in this experiment were inspired by the Deep Learning Part 1 course. I used a resnet34 pre-trained on the 1000 ImageNet classes for the experiment below.

Steps followed:

1) Image resizing

2) Identifying the optimum learning rate

3) Training the output layer while keeping the weights of the initial layers fixed

4) Retraining the complete model

5) Analyzing the results

1) Image resizing

As I mentioned before, most of the images were of different resolutions. I wanted to ensure uniformity across image resolutions so that GPU computation could be optimal. Below is an example of how the images looked after resizing.
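As for the resizing itself, a minimal sketch with PIL (the folder names and the 224-pixel target size are my assumptions, not the exact values used):

from pathlib import Path
from PIL import Image

SRC, DST, SIZE = Path('data/raw'), Path('data/resized'), 224  # assumed paths/size
DST.mkdir(parents=True, exist_ok=True)

for img_path in SRC.glob('*.jpg'):
    img = Image.open(img_path).convert('RGB')
    # LANCZOS resampling preserves detail when downscaling large satellite images
    img.resize((SIZE, SIZE), Image.LANCZOS).save(DST / img_path.name)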

2) Identifying the optimum learning rate

One of the most amazing tools showcased by Jeremy in the DL-1 course is the learning rate finder. Finding the right learning rate is always a big pain for ML practitioners across methodologies (random forests, gradient boosting, deep learning). The graph below shows what the learning rate finder produced. As my training sample is quite small, I decided to train the model with a learning rate of 0.001 to make sure it learns slowly without jumping to conclusions.
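A minimal sketch of the finder with the fastai library of that era (the v0.7 API from the 2018 course; PATH, the image size and the transforms are assumptions on my part):

# Build a pretrained learner and sweep learning rates (fastai v0.7 API)
from fastai.conv_learner import *

arch = resnet34
data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, 224))
learn = ConvLearner.pretrained(arch, data, precompute=True)

learn.lr_find()     # increase the rate batch by batch over a mini-epoch
learn.sched.plot()  # loss vs. learning rate; pick a value before the minimum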

3) Training the output layer while keeping the weights of the initial layers fixed

This step pushed me to an accuracy of ~65%, which looked quite bad compared to one of the papers I was referring to:

Transfer Learning from Deep Features for Remote Sensing and Poverty Mapping (Michael Xie, Neal Jean, Marshall Burke, David Lobell and Stefano Ermon)
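In the v0.7 fastai API, a freshly created pretrained learner keeps the body frozen, so a plain fit trains only the new head. A sketch, continuing the learner built above (the epoch count is an assumption):

# Only the new output layer is updated; the pretrained body stays frozen
learn.fit(1e-3, 3)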

4) Retraining the complete model

Now, resnet34 is pre-trained on the 1000 object classes of ImageNet, and those classes are not exactly suitable for transfer learning on satellite images. Hence, I retrained all the layers of the network for our use case. The results were quite impressive, as I moved to an accuracy of ~75%. One thing I observed is that the accuracy is not very stable across epochs, which could be due to the small training sample (<100 images): the function the model is trying to fit is not stable, which may lead to a drop in performance out of sample.
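A hedged sketch of this retraining step, continuing the same learner (fastai v0.7; the differential learning rates follow the course pattern, the exact values are assumptions):

learn.precompute = False  # use real (augmentable) images, not cached activations
learn.unfreeze()          # make every layer trainable, not just the head

# Lower rates for early layers (generic edges/textures), higher for later ones
lrs = np.array([1e-5, 1e-4, 1e-3])
learn.fit(lrs, 3, cycle_len=1)  # SGDR-style restarts, one epoch per cycle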

5) Analyzing the results

Let's look at how the model is predicting:

A few correct images at random

Model correctly predicting storms and wildfires

A few incorrect images at random

Model not able to differentiate between other natural disasters and storm and wildfires

The model is not able to differentiate between other natural disasters, snow for example, and what we are predicting for, that is storms and wildfires. This can be tricky even for a human eye, as both disasters come down to looking for a white substance (sleet in the case of snow, smoke in the case of fire). This may confuse any man or machine. The solution could be more training data, which would give the machine more data points from which to draw correct conclusions.
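One way to inspect these mistakes is a confusion matrix over the validation set; a minimal sketch, continuing the learner from the sketches above (fastai v0.7 plus scikit-learn):

from sklearn.metrics import confusion_matrix

log_preds = learn.predict()           # log-probabilities for the validation images
preds = np.argmax(log_preds, axis=1)  # predicted class per image
print(confusion_matrix(data.val_y, preds))  # rows: true class, columns: predicted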

Most correct Storms and wildfires

Most correct Others (that is not a storm or wildfire)

Most incorrect storm or wildfires

Most incorrect others(that is not a storm or wildfire)

Conclusion

Although our deep learning model for predicting which natural disasters are storms or wildfires reaches an accuracy of ~75%, which looks good (compared to the academic literature I was following), the model seems to be unstable. Given the confusing nature of different disaster images (where a snow storm may look similar to a wildfire), even a human eye may be fooled. Using more training points (I used only ~80 images in this experiment) should be more fruitful.

Next steps

  1. Source more satellite images of natural disasters
  2. Try training on a larger sample size
  3. Further simplify the problem statement (maybe try to distinguish snow storms from wildfires) rather than having a mixed bag in the others class

Heartiest thanks to Jeremy Howard, Rachel Thomas and the complete team at fastai for this amazing MOOC and the fastai toolkit. The top-down learning methodology should make everyone love deep learning, given its utility focus and the wide range of use cases it can be applied to. I just can't wait for Part 2 of the 2018 course to be released online.

For how the results looked once I tackled the problem again in my next iteration, please read Part 2 of my experiment.

Part 2: Using Deep learning to identify natural disasters from satellite images

This post is a quick follow-up on Part 1 of the experiment (follow the link to revisit it), where I am trying to predict natural disasters using satellite images from the NASA Earth Observatory.

Quick recap:

  1. I was able to get to ~75% accuracy with a resnet34 model using ~80 images.
  2. I wanted to push the boundaries of this result using data augmentation, adding more data and maybe better model selection.

Taking it forward from Part 1, I decided to do the following:

Step 1: Try to identify whether there are ways to add more images belonging to either class. On second thought I decided against that, as I wanted this to remain a limited-data problem.

Step 2: Improving data augmentation was my key tactic. I changed the image size to 512 × 512 and allowed zooming up to 2.0 times the original size. The intuition: the images the model failed on contain a high level of detail that we want it to train on, and a higher zoom should give the model a better chance of getting those details right. If you look at the images below, all of them are confusing in themselves, hence wildfires were tagged as others; a higher zoom should let the DL algorithm look into the details of an image (see the sketch after Step 3).

Images of storms and wildfires that were incorrectly classified in the last iteration

Step 3: Change the model architecture. I moved from resnet34 to resnet50. The intuition: since this is a complicated problem with a limited number of training samples, we need more layers to account for a higher level of abstraction.
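A minimal sketch of Steps 2 and 3 together (fastai v0.7 API; PATH and the side-on transform set are my assumptions):

from fastai.conv_learner import *

arch = resnet50  # deeper architecture than the resnet34 used in Part 1
sz = 512         # resize to 512 x 512

# Side-on flips plus up to 2x zoom so the model can pick up fine detail;
# precompute is off so the augmentations are actually applied each epoch
tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=2.0)
data = ImageClassifierData.from_paths(PATH, tfms=tfms)
learn = ConvLearner.pretrained(arch, data, precompute=False)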

After this I proceeded with the other steps of model development: the LR finder, training the final layers and then retraining the weights of all the layers.

I needed to set the learning rate even lower than what the learning rate finder indicated; otherwise the model was not able to reach the function's minimum. Once this was done, the final results are shown below.
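A hedged sketch of this final training pass (the exact rates and epoch counts are assumptions, not the author's logs):

learn.lr_find()
learn.sched.plot()   # the curve suggested one rate...

learn.fit(1e-4, 3)   # ...but a lower rate was needed for the loss to keep falling
learn.unfreeze()
learn.fit(np.array([1e-6, 1e-5, 1e-4]), 3, cycle_len=1)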

Resnet50 trained on training data with higher-zoom augmentation

Voila! I was able to hit an accuracy of ~81.25%, an increase of about 625 bps over the last training using resnet34.

Learnings from the mini project:

  1. When in doubt, try to simplify things. Try to remove levels of abstraction and think from fundamentals (although a complicated problem may need a complicated solution, for example moving from resnet34 to resnet50).
  2. If your training sample could confuse a human, it may confuse a machine or algorithm too. If there is a chance a human would mistake a snow storm for a wildfire in a satellite image, then so might a machine. What would a human do? Take time to reach a conclusion (a low learning rate), and look further into the details of the image (zooming in).

Hence, the exercise reemphasizes the fact that we need to know our data and problem area well before any successful predictive algorithm can be developed.

 

