※ Notes on Professor Sung Kim's lecture "Deep Learning for Everyone (모두를 위한 딥러닝)"
- References: Andrew Ng's ML class
1) https://class.coursera.org/ml-003/lecture
2) http://holehouse.org/mlclass/ (notes)
1. Hypothesis and Cost
- The original hypothesis and cost
- The simplified hypothesis and cost (both are written out below)
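For reference, the hypothesis and cost in standard linear regression notation, where x^{(i)}, y^{(i)} is the i-th training example and m is the number of examples:

H(x) = Wx + b
\text{cost}(W, b) = \frac{1}{m} \sum_{i=1}^{m} \left( H(x^{(i)}) - y^{(i)} \right)^2

Simplified (the bias b is dropped, so the cost becomes a function of W alone):

H(x) = Wx
\text{cost}(W) = \frac{1}{m} \sum_{i=1}^{m} \left( W x^{(i)} - y^{(i)} \right)^2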
2. Gradient descent algorithm
- An algorithm used for many minimization problems, including minimizing cost
- For cost(W, b), it finds the values of W and b that minimize the cost
- Differentiate the cost and repeatedly step downhill along the slope; the point where the derivative (slope) reaches 0 is where the cost is minimized (see the update rule below)
- When applying it to linear regression, check that the cost function is a convex function
-> only then is gradient descent guaranteed to reach the correct (global) minimum from any starting point
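In symbols, one gradient descent step moves W against the slope of the cost, scaled by a learning rate \alpha:

W := W - \alpha \frac{\partial}{\partial W} \text{cost}(W)

Repeating this update walks W downhill until the derivative is (near) 0.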
3. TensorFlow implementation of cost minimization for Linear Regression
import tensorflow as tf
import matplotlib.pyplot as plt  # plotting library; install it first with "pip install matplotlib"
X = [1, 2, 3]
Y = [1, 2, 3]
W = tf.placeholder(tf.float32)
# Our hypothesis for linear model X * W
hypothesis = X * W
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Variables for plotting cost function
W_history = []
cost_history = []
# Launch the graph in a session.
with tf.Session() as sess:
    for i in range(-30, 50):
        curr_W = i * 0.1
        curr_cost = sess.run(cost, feed_dict={W: curr_W})
        W_history.append(curr_W)
        cost_history.append(curr_cost)
# Show the cost function
plt.plot(W_history, cost_history)
plt.show()
4. Applying the Gradient descent algorithm
(1) Manual implementation of the algorithm using the derivative
# Minimize: Gradient Descent using derivative: W -= learning_rate * derivative
learning_rate = 0.1  # the alpha value
gradient = tf.reduce_mean((W * X - Y) * X)
descent = W - learning_rate * gradient
update = W.assign(descent)
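Written out, the gradient line above is the derivative of the simplified cost. Note that the exact derivative carries a constant factor of 2, which this code drops since it only rescales learning_rate (the code in section 5 below keeps the 2):

\frac{\partial}{\partial W} \frac{1}{m} \sum_{i=1}^{m} \left( W x^{(i)} - y^{(i)} \right)^2 = \frac{2}{m} \sum_{i=1}^{m} \left( W x^{(i)} - y^{(i)} \right) x^{(i)}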
- full code
import tensorflow as tf
tf.set_random_seed(777)  # for reproducibility
x_data = [1, 2, 3]
y_data = [1, 2, 3]
# Try to find values for W and b to compute y_data = W * x_data
# We know that W should be 1
# But let's use TensorFlow to figure it out
W = tf.Variable(tf.random_normal([1]), name="weight")
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
# Our hypothesis for linear model X * W
hypothesis = X * W
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize: Gradient Descent using derivative: W -= learning_rate * derivative
learning_rate = 0.1
gradient = tf.reduce_mean((W * X - Y) * X)
descent = W - learning_rate * gradient
update = W.assign(descent)
# Launch the graph in a session.
with tf.Session() as sess:
    # Initializes global variables in the graph.
    sess.run(tf.global_variables_initializer())
    for step in range(21):
        _, cost_val, W_val = sess.run(
            [update, cost, W], feed_dict={X: x_data, Y: y_data}
        )
        print(step, cost_val, W_val)
(2) Implementation using TensorFlow's Optimizer (no need to compute the derivative ourselves)
# Minimize: Gradient Descent Magic
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
train = optimizer.minimize(cost)
- full code
import tensorflow as tf
tf.set_random_seed(777)  # for reproducibility
# tf Graph Input
X = [1, 2, 3]
Y = [1, 2, 3]
W = tf.Variable(tf.random_normal([1]), name="weight")
# Linear model
hypothesis = X * W
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize: Gradient Descent Optimizer
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
for step in range(101):
    _, W_val = sess.run([train, W])
    print(step, W_val)
- When the initial value of W is set to 5:
import tensorflow as tf
# tf Graph Input
X = [1, 2, 3]
Y = [1, 2, 3]
# Set wrong model weights
W = tf.Variable(5.0)
# Linear model
hypothesis = X * W
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize: Gradient Descent Optimizer
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
# Launch the graph in a session.
with tf.Session() as sess:
    # Initializes global variables in the graph.
    sess.run(tf.global_variables_initializer())
    for step in range(101):
        _, W_val = sess.run([train, W])
        print(step, W_val)
5. (Optional) Computing and applying gradients with TensorFlow
import tensorflow as tf
# tf Graph Input
X = [1, 2, 3]
Y = [1, 2, 3]
# Set wrong model weights
W = tf.Variable(5.)
# Linear model
hypothesis = X * W
# Manual gradient (keeps the exact factor of 2 from the derivative)
gradient = tf.reduce_mean((W * X - Y) * X) * 2
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize: Gradient Descent Optimizer
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
# Get gradients: have the optimizer compute the (gradient, variable) pairs
gvs = optimizer.compute_gradients(cost)
# Optional: modify gradient if necessary
# gvs = [(tf.clip_by_value(grad, -1., 1.), var) for grad, var in gvs]
# Apply gradients: have the optimizer apply the (possibly modified) pairs
apply_gradients = optimizer.apply_gradients(gvs)
# Launch the graph in a session.
with tf.Session() as sess:
    # Initializes global variables in the graph.
    sess.run(tf.global_variables_initializer())
    for step in range(101):
        gradient_val, gvs_val, _ = sess.run([gradient, gvs, apply_gradients])
        print(step, gradient_val, gvs_val)