Ship vs iceberg discriminator
TL;DR: Discriminate between ships and icebergs from SAR imagery.
Approach
Data augmentation and parameter sharing, using CNNs and residual networks (ResNets).
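A minimal sketch of the augmentation step, assuming a Keras setup (suggested by the .h5 weight files under data/params). The transforms, their parameters, and the 75x75x2 input shape (two radar bands) are illustrative assumptions, not settings documented in this repo.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Illustrative augmentation settings; the repo's actual parameters
# are not documented in this README.
augmenter = ImageDataGenerator(
    rotation_range=20,      # small random rotations
    horizontal_flip=True,   # SAR patches have no canonical orientation
    vertical_flip=True,
    zoom_range=0.1,
)

# Dummy stand-in for the two-band 75x75 SAR patches (shape is an assumption).
x = np.random.randn(8, 75, 75, 2).astype("float32")
y = np.random.randint(0, 2, size=8)

batches = augmenter.flow(x, y, batch_size=4)  # yields augmented batches
x_aug, y_aug = next(batches)
```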
Data directory
The data directory is expected to have the following structure:
data/
├── params
│ ├── base_cnn-scaling.pkl
│ ├── base_cnn-weights-loss.h5
│ ├── base_cnn-weights-val_loss.h5
│ ├── icenet-weights-loss.h5
│ └── icenet-weights-val_loss.h5
├── predictions
│ └── icenet-dev.csv
├── sample_submission.csv
├── test.json
└── train.json
where {train,test}.json are the data files downloaded from the Kaggle competition website.
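A minimal loading sketch for these files. It assumes the Statoil/C-CORE iceberg challenge format (fields band_1, band_2, and is_iceberg, with each band a flattened 75x75 patch); that format is an assumption based on the Kaggle competition, not something documented here.

```python
import numpy as np
import pandas as pd

train = pd.read_json("data/train.json")

# Each band is assumed to be a flattened 75x75 backscatter patch.
def to_image(row):
    b1 = np.array(row["band_1"]).reshape(75, 75)
    b2 = np.array(row["band_2"]).reshape(75, 75)
    return np.stack([b1, b2], axis=-1)        # (75, 75, 2)

x = np.stack([to_image(row) for _, row in train.iterrows()])
y = train["is_iceberg"].values                # 1 = iceberg, 0 = ship
```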
Log
Residual base CNN
Summary:
- Test loss: 0.5099
- Test accuracy: 0.7932
- Epochs: 100
- Best validation loss at epoch 70 (loss stayed converged through epoch 100; no overfitting)
Comments:
- Low variance -- training loss is consistently a bit lower than validation loss.
- Since the images are "artificially" labeled, it is hard to say what the bias is. There should be some bias, since this network does not overfit, and training appears to have converged by epoch 100 (with a decaying learning rate).
- There may also be labeling noise. It is indeed suspicious that the validation loss converges with very low variance. Perhaps revisit the labeling approach for the base generator.
- Conclusion: Check labeling, then bring out the big guns and expand the residual net.
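For reference, a minimal sketch of the kind of residual block a residual base CNN is built from, in Keras; the filter counts, depth, and input shape used in this repo are assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers

def residual_block(x, filters=32):
    """Two 3x3 convs plus a skip connection (illustrative block)."""
    shortcut = x
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    if shortcut.shape[-1] != filters:
        # 1x1 projection so the skip connection matches the channel count
        shortcut = layers.Conv2D(filters, 1, padding="same")(shortcut)
    y = layers.add([y, shortcut])
    return layers.Activation("relu")(y)

# Usage: stack a few blocks on a 75x75x2 input (shape is an assumption)
inp = keras.Input(shape=(75, 75, 2))
h = residual_block(inp, 32)
h = layers.MaxPooling2D()(h)
h = residual_block(h, 64)
```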
Using this model as the base for each of the 9 image regions, followed by a reshape, a convolutional layer, and two dense layers, yields OK performance: around 0.20 loss after a few epochs.
However, validation loss is often lower than training loss. It might be that the train/validation distributions are not the same for the two networks -- check the random seed and verify! It might also be noise from augmentation: augmented training batches are harder than the un-augmented validation data, which can push training loss above validation loss.
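A sketch of the 9-region parameter-sharing idea described above (possibly the "icenet" model whose weights sit under data/params): one shared base CNN is applied to each of 9 crops, the resulting feature vectors are reassembled into a 3x3 grid, then combined by a conv layer and two dense layers. The crop layout, region size, and layer widths are assumptions; base_cnn is assumed to map a 25x25x2 crop to an n_features vector.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_region_net(base_cnn, region_size=25, n_features=64):
    inp = keras.Input(shape=(75, 75, 2))            # assumed input shape
    feats = []
    for i in range(3):
        for j in range(3):
            # Crop one of the 9 regions; the same base_cnn (shared
            # weights) is applied to every crop.
            crop = layers.Cropping2D(
                ((i * region_size, 75 - (i + 1) * region_size),
                 (j * region_size, 75 - (j + 1) * region_size)))(inp)
            feats.append(base_cnn(crop))            # shape (None, n_features)
    x = layers.Concatenate()(feats)
    x = layers.Reshape((3, 3, n_features))(x)       # reassemble as a 3x3 grid
    x = layers.Conv2D(32, 2, activation="relu")(x)
    x = layers.Flatten()(x)
    x = layers.Dense(64, activation="relu")(x)
    out = layers.Dense(1, activation="sigmoid")(x)
    return keras.Model(inp, out)
```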