Nuts and Bolts of Applying Deep Learning (Andrew Ng)

A session by the ever-warm Professor Andrew Ng.
A good session covering the big picture and practical tips from real research. Brief notes.

 

DL: why is it working so well?
amount of data: as data grows, traditional learning algorithms plateau, while bigger neural networks keep improving
major DL buckets:
general DL (fully connected networks)
sequence models (RNN, LSTM)
image models (CNN)
unsupervised / reinforcement learning
#2 end-to-end dl
movie review -> sentiment 0/1
image -> object recognition 1~1000
more complex states
image -> caption(entire sentence)
audio -> transcript
translation
parameters -> synthesize image
(how about music?)
speech
1) audio -> phonemes(basic units of sound) -> transcript
2) audio -> transcript
Dropping the phonemes upset the linguists. lol
ex. x-ray
1) traditional: image -> bone length -> age
2) end-to-end: image -> age
(but we don't have enough data)
ex.
image -> cars -> trajectory (hand-coded) -> steering
  -> pedestrians
image -> steering
(not enough data..)
_bias/variance
goal : build human-level speech recognition system
train/val/test set
ex.1
let’s say
human-level error 1%
train set error 5%
val set error 6%
human error – train set error : bias
ex.2
human-level error 1%
train set error 2%
val set error 6%
train error – val error : variance
need regularization
ex.3
human-level error 1%
train set error 5%
val set error 10%
high bias, high variance
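The three examples above can be sketched as a tiny diagnostic helper. This is my own illustration of the talk's recipe; the function name and the 2% threshold are assumptions, not from the talk.

```python
# Rough bias/variance diagnosis from error rates, following the talk's recipe.
# The threshold for "high" is an arbitrary illustration value.

def diagnose(human_err, train_err, val_err, threshold=0.02):
    """Label the dominant problem given human-level, train, and val error."""
    bias = train_err - human_err      # avoidable bias (gap to human level)
    variance = val_err - train_err    # generalization gap
    problems = []
    if bias > threshold:
        problems.append("high bias")
    if variance > threshold:
        problems.append("high variance")
    return problems or ["looks fine"]

print(diagnose(0.01, 0.05, 0.06))  # ex.1 -> ['high bias']
print(diagnose(0.01, 0.02, 0.06))  # ex.2 -> ['high variance']
print(diagnose(0.01, 0.05, 0.10))  # ex.3 -> ['high bias', 'high variance']
```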
1)
ask yourself, is your training error high?
-> if so,
 – bigger model
 – train longer
 – new model architecture
-> if no,
2) is train-dev(val) error high?
-> if so,
 – more data
 – regularization
 – new model architecture
_automatic data synthesis
hand-engineered features
ex. OCR
speech : clean audio + random background sounds
nlp
video games(RL)
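The "clean audio + random background sounds" idea can be sketched in a few lines of NumPy. The array shapes, the SNR knob, and all names here are my own assumptions for illustration.

```python
# Sketch of speech data synthesis: mix clean audio with a random crop of
# background noise at a chosen signal-to-noise ratio (SNR).
import numpy as np

rng = np.random.default_rng(0)

def synthesize(clean, noise, snr_db=10.0):
    """Mix a clean waveform with a random noise crop at snr_db dB SNR."""
    start = rng.integers(0, len(noise) - len(clean) + 1)
    crop = noise[start:start + len(clean)]
    # Scale the noise so the mix hits the requested SNR.
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(crop ** 2) + 1e-12
    gain = np.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return clean + gain * crop

clean = rng.standard_normal(16000)   # 1 s of fake speech at 16 kHz
noise = rng.standard_normal(160000)  # 10 s of fake background noise
mixed = synthesize(clean, noise)
print(mixed.shape)  # (16000,)
```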
-> unified data warehouse
ex. speech recognition in car
50000h data from all sources on the web
+
10h actual data we recorded and made from the scenario
train / dev set from 50000h
test set from 10h
: bad idea
-> make sure dev set and test set are from the same distribution
train set from 50000h -> train 49980h/ train-dev 20h
dev / test set from 10h
(should be this way)
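The recommended split can be written out explicitly. The hour counts follow the notes; the 5h/5h dev/test division and the helper itself are my own assumptions.

```python
# Sketch of the recommended split: train/train-dev both come from the big
# web corpus, while dev/test both come from the small in-car recordings,
# so that dev and test share the same distribution.

def split_hours(web_hours=50000, incar_hours=10, train_dev_hours=20):
    return {
        "train": web_hours - train_dev_hours,  # 49980 h from the web
        "train-dev": train_dev_hours,          # 20 h held out from the web
        "dev": incar_hours / 2,                # in-car audio (assumed 50/50)
        "test": incar_hours / 2,               # in-car audio (assumed 50/50)
    }

print(split_hours())
```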
Measure performance
ex.1
human-level performance: 1%
training set : 10% (high bias)
train-dev : 10% (variance)
dev : 10% (train-test mismatch)
test : 10% (overfitting of dev)
ex.2
human-level performance: 1%
training set : 2%
train-dev : 2.1%
dev : 10% (high train-test set mismatch problem)
test : 10%
3) is your dev error high?
 – more data(similar to test set)
 – data synthesis (data augmentation)
 – new architecture
4) test set error high?
 – get more dev data
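The full 4-step recipe above can be sketched as one dispatch function. The gap threshold and all names are my own illustration, not from the talk.

```python
# The 4-step debugging recipe from the notes: compare each adjacent pair of
# error rates and act on the first gap that is too large.

def next_action(human, train, train_dev, dev, test, gap=0.02):
    if train - human > gap:          # step 1: high bias
        return "bigger model / train longer / new architecture"
    if train_dev - train > gap:      # step 2: high variance
        return "more data / regularization / new architecture"
    if dev - train_dev > gap:        # step 3: train-test mismatch
        return "more data similar to test / data synthesis / new architecture"
    if test - dev > gap:             # step 4: overfitting of dev
        return "get more dev data"
    return "done (at target)"

print(next_action(0.01, 0.10, 0.10, 0.10, 0.10))   # high bias case
print(next_action(0.01, 0.02, 0.021, 0.10, 0.10))  # train-test mismatch case
```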
Once DL models surpass human-level error, they stop improving easily.
optimal error rate (Bayes rate)
so long as you are worse than human-level,
you have good ways to make progress:
 – get labels from humans
 – error analysis (look for human insight)
 – estimate bias/variance effects
  : human-level error is the yardstick that tells you whether the current problem is bias or variance; once you pass it, it is hard to know how to improve.