Alibaba-Cloud-German-AI-Challenge-2018

This is a Tianchi Big Data competition:
https://tianchi.aliyun.com/competition/introduction.htm?spm=5176.100067.5678.1.3e7731f5WP7NmY&raceId=231683

The network is L_Resnet_E_IR, and the loss function is based on softmax cross-entropy.

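Progress as of December 6: ranked in the top 5% on the online leaderboard.
1. Dataset description:
The dataset is split into a training set and a validation set. Each sample has two parts, sen1 and sen2. sen1 is SAR imagery, stored as real and imaginary parts, and two of its channels have been filtered. sen2 is hyperspectral imagery.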

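Characteristics:
a. Each sample is a stack of layers from two satellites and can be treated as a 32x32x18 image.
b. Both the test set and the validation set have an imbalanced class distribution, so the task can be treated as an imbalanced multi-label classification problem.
c. Pixel values in the sen1 layers vary widely (from negative values to over a thousand), while pixel values in the sen2 layers are mostly floats between 0 and 1.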
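
To make points a and c concrete, here is a minimal NumPy sketch of stacking the two layers into one 32x32x18 array and checking per-channel value ranges. The 8 + 10 channel split, the function names, and the synthetic data are assumptions for illustration; the notes above only give the combined shape.

```python
import numpy as np

# Assumed channel split: 8 sen1 (SAR) channels + 10 sen2 channels = 18.
# The notes only state the combined shape (32, 32, 18).
SEN1_CHANNELS = 8
SEN2_CHANNELS = 10


def stack_patches(sen1, sen2):
    """Concatenate sen1 and sen2 patches along the last (channel) axis."""
    if sen1.shape[:-1] != sen2.shape[:-1]:
        raise ValueError("sen1 and sen2 must share batch and spatial dimensions")
    return np.concatenate([sen1, sen2], axis=-1).astype(np.float32)


def channel_ranges(batch):
    """Return per-channel (min, max) to inspect the scale gap between layers."""
    flat = batch.reshape(-1, batch.shape[-1])
    return flat.min(axis=0), flat.max(axis=0)


# Synthetic stand-in data that only mimics the value ranges described above.
rng = np.random.default_rng(0)
sen1 = rng.normal(0.0, 300.0, size=(4, 32, 32, SEN1_CHANNELS))
sen2 = rng.uniform(0.0, 1.0, size=(4, 32, 32, SEN2_CHANNELS))

stacked = stack_patches(sen1, sen2)
ch_min, ch_max = channel_ranges(stacked)
print(stacked.shape)
```

A range check like this is one way to see why the sen1 channels would probably need rescaling before they are fed to the same network as the sen2 channels.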

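2. Model training (split into four modules: preprocessing, feature extraction, model training, and post-processing. Because the modules differ in difficulty, they are tackled from easiest to hardest. [] marks a step that is still to be done.)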

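Network training module:
1. Trained on the training set for 70,000 steps until the model overfit (100% fit on the training set); validation accuracy was around 60%.
2. Implemented early stopping during training. The best point was at step 34,000, where the fit on the training set was 90% and validation accuracy was 61%.
3. Taken together, 1 and 2 show that whether or not the model overfits the training set has little effect on validation accuracy.
4. Submitted the model that performed best on the validation set (61% accuracy); accuracy on the online test set was just under 60%.
5. Because test and validation accuracy are close, the hypothesis is that the test set and the validation set have similar distributions, so the validation set is used to help train the model.
6. Merged the training and validation sets, trained until overfitting (100%), and submitted the result; online accuracy reached 75%.
[1],[2],[3]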
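
The early stopping in step 2 could be sketched roughly as below. This is not the code used for the competition: the `EarlyStopping` class, the `patience` parameter, and the accuracy values are all hypothetical and only illustrate the bookkeeping.

```python
class EarlyStopping:
    """Track the best validation accuracy and signal when to stop."""

    def __init__(self, patience):
        self.patience = patience          # evaluations to wait without improvement
        self.best_acc = float("-inf")
        self.best_step = None
        self.bad_evals = 0

    def update(self, step, val_acc):
        """Record one evaluation; return True when training should stop."""
        if val_acc > self.best_acc:
            self.best_acc = val_acc
            self.best_step = step
            self.bad_evals = 0            # a checkpoint would be saved here
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


# Made-up (step, validation accuracy) pairs, for illustration only.
history = [(1000, 0.50), (2000, 0.58), (3000, 0.61),
           (4000, 0.60), (5000, 0.59), (6000, 0.60)]

stopper = EarlyStopping(patience=3)
for step, acc in history:
    if stopper.update(step, acc):
        break

print(stopper.best_step, stopper.best_acc)
```

The model saved at `best_step` is the one that would be submitted, which matches the idea in step 4 of submitting the checkpoint with the best validation accuracy.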

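Preprocessing module:
1. To deal with the imbalanced data distribution, oversample by repetition so that the data distribution is balanced during training.
[4]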
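
Oversampling by repetition can be sketched as follows. The sketch assumes each sample has a single integer class index (for example, the argmax of a one-hot label); the function name `oversample_indices` and the toy labels are hypothetical.

```python
import numpy as np


def oversample_indices(labels, seed=0):
    """Return shuffled sample indices in which minority-class samples are
    repeated until every class is as large as the largest class."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    picked = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(labels == cls)
        # Keep every original sample, then draw the shortfall with replacement.
        extra = rng.choice(idx, size=target - count, replace=True)
        picked.append(np.concatenate([idx, extra]))
    out = np.concatenate(picked)
    rng.shuffle(out)
    return out


labels = np.array([0, 0, 0, 0, 0, 1, 1, 2])    # toy imbalanced labels
balanced = oversample_indices(labels)
print(np.bincount(labels[balanced]))
```

Returning indices rather than copied images keeps memory use low; the training loop can index into the original arrays batch by batch.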

Feature extraction module:
Fuse the sen1 and sen2 data; see the blog http://blog.sina.com.cn/s/articlelist_1984634525_4_1.html

Post-processing module:
[5],[6],[7]

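todo list:
[1] Tune the weight_decay loss and use regularization to reduce overfitting
[2] Try different activation functions
[3] Save the best model reached during the overfitting run and submit it to the online test. Tentative plan: hold out 5,000 samples from the validation set for validation and add the rest to the training set
[4] Image normalization, mirror flipping, random cropping, and extracting the center and four-corner sub-images (x5)
[5] First learn features with the neural network, then take the vector from the network's last layer and classify it with a traditional classifier such as GBDT or LightGBM
[6] After obtaining the vectors, classify them with auto-sklearn
[7] Train separate neural networks on the multi-aperture (SAR) and hyperspectral data, then ensemble them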
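
As an illustration of todo item [4], the cropping and mirroring steps might look like the NumPy sketch below. The crop size of 28 and all function names are assumptions; the todo item does not specify them, and normalization is left out.

```python
import numpy as np


def five_crops(image, size):
    """Return the four corner crops plus the center crop of an (H, W, C) image."""
    h, w = image.shape[:2]
    if size > h or size > w:
        raise ValueError("crop size exceeds image size")
    top, left = (h - size) // 2, (w - size) // 2
    offsets = [(0, 0), (0, w - size), (h - size, 0), (h - size, w - size), (top, left)]
    return np.stack([image[y:y + size, x:x + size] for y, x in offsets])


def random_crop(image, size, rng):
    """Cut one size x size window at a uniformly random position."""
    h, w = image.shape[:2]
    y = rng.integers(0, h - size + 1)
    x = rng.integers(0, w - size + 1)
    return image[y:y + size, x:x + size]


def mirror(image):
    """Flip the image horizontally (reverse the width axis)."""
    return image[:, ::-1]


# Dummy 32x32x18 image; the crop size of 28 is an assumed value.
image = np.arange(32 * 32 * 18, dtype=np.float32).reshape(32, 32, 18)
crops = five_crops(image, 28)
patch = random_crop(image, 28, np.random.default_rng(0))
print(crops.shape, patch.shape)
```

At test time the five crops could each be scored and their predictions averaged, while the random crop and mirror would typically be applied only during training.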

Contributors

colabin
