# Ship vs iceberg discriminator

TL;DR: Discriminate between ships and icebergs from SAR imagery.

## Approach

Data augmentation and parameter sharing, using CNNs and residual networks (ResNets).
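As a rough illustration (the layer sizes, the two-band 75x75 input shape, and the augmentation settings below are assumptions, not the exact architecture saved under `data/params`), a residual base CNN with simple flip/zoom augmentation might look like this in Keras:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def residual_block(x, filters):
    """Two 3x3 convolutions with an identity skip connection."""
    shortcut = x
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same")(x)
    x = layers.add([shortcut, x])
    return layers.Activation("relu")(x)

def build_base_cnn(input_shape=(75, 75, 2)):
    """Small residual CNN over the two stacked radar bands (assumed 75x75x2 input)."""
    inputs = layers.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
    x = residual_block(x, 32)
    x = layers.MaxPooling2D()(x)
    x = residual_block(x, 32)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)
    model = models.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# On-the-fly augmentation: flips and small zooms are label-preserving for
# ship/iceberg patches and effectively enlarge the training set.
augmenter = tf.keras.preprocessing.image.ImageDataGenerator(
    horizontal_flip=True, vertical_flip=True, zoom_range=0.1
)
```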
## Data directory

The data directory is expected to have the structure shown below:
```
data/
├── params
│   ├── base_cnn-scaling.pkl
│   ├── base_cnn-weights-loss.h5
│   ├── base_cnn-weights-val_loss.h5
│   ├── icenet-weights-loss.h5
│   └── icenet-weights-val_loss.h5
├── predictions
│   └── icenet-dev.csv
├── sample_submission.csv
├── test.json
└── train.json
```
where `{train,test}.json` are the data files from the
[kaggle website](https://www.kaggle.com/c/statoil-iceberg-classifier-challenge).
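The snippet below shows one way to turn those JSON files into image tensors; the field names (`band_1`, `band_2`, `is_iceberg`) and the flattened 75x75 band layout follow the competition's data description:

```python
import numpy as np
import pandas as pd

# Each record in train.json carries two flattened 75x75 radar bands and a label.
train = pd.read_json("data/train.json")

def to_image(row):
    band_1 = np.array(row["band_1"]).reshape(75, 75)
    band_2 = np.array(row["band_2"]).reshape(75, 75)
    return np.stack([band_1, band_2], axis=-1)  # shape (75, 75, 2)

X = np.stack([to_image(row) for _, row in train.iterrows()])
y = train["is_iceberg"].values
```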
## Log

### Residual base CNN

Summary:

* Test loss: 0.5099
* Test accuracy: 0.7932
* Epochs: 100
* Best val loss at epoch 70 (remained converged through epoch 100, did not overfit)

Comments:
* Low variance -- training loss is consistently a bit lower than validation
  loss.
* Since images are "artificially" labeled, it is hard to say what the bias is.
  There should be some bias since this network does not overfit, and it also
  looks like training converges after 100 epochs (with a decaying learning rate).
* There may also be labeling noise. It is indeed suspicious that the validation
  loss converges with very low variance. Perhaps revisit the labeling
  approach for the base generator.
* Conclusion: Check labeling, then bring out the big guns and expand the
  residual net.

Using this model as the basis for the 9 regions, followed by a reshape, a conv
layer, and two dense layers, yields OK performance: around 0.20 loss after a
few epochs.
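A sketch of that parameter-sharing idea (the region size, how the 9 regions are cropped, and the layer sizes here are hypothetical; the actual icenet may differ): calling the same Keras model object on each region input reuses its weights across all 9 regions.

```python
from tensorflow.keras import layers, models

def build_icenet(base_model, n_regions=9, region_shape=(75, 75, 2)):
    """Apply one shared base model to 9 regions, then reshape the per-region
    scores into a 3x3 grid and combine them with a conv and two dense layers.
    region_shape must match the input shape the shared base model expects."""
    region_inputs = [layers.Input(shape=region_shape) for _ in range(n_regions)]
    # Reusing the same model object shares its parameters across all regions.
    region_scores = [base_model(r) for r in region_inputs]
    x = layers.concatenate(region_scores)   # (batch, 9)
    x = layers.Reshape((3, 3, 1))(x)        # lay the 9 scores out as a 3x3 grid
    x = layers.Conv2D(8, 2, activation="relu")(x)
    x = layers.Flatten()(x)
    x = layers.Dense(16, activation="relu")(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)
    model = models.Model(region_inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```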
However, the validation loss is often lower than the training loss. It might be
that the train/validation distributions are not the same for the two networks
-- check the random seed and verify! It might also be noisy training (because
of augmentation).
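One hypothetical way to rule out a split mismatch is to fix the seed of the train/validation split so that both networks train and validate on exactly the same examples, e.g.:

```python
from sklearn.model_selection import train_test_split

# X, y as loaded from train.json above; a fixed random_state makes the split
# identical across runs, so the base CNN and icenet see the same distributions.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
```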