Deep Learning

 A summary of Professor 김성훈 (Sung Kim)'s lecture series [모두를 위한 딥러닝] (Deep Learning for Everyone)

 - https://www.youtube.com/watch?reload=9&v=BS6O0zOGX4E&feature=youtu.be&list=PLlMkM4tgfjnLSOjrEJN31gZATbcj_MpUm&fbclid=IwAR07UnOxQEOxSKkH6bQ8PzYj2vDop_J0Pbzkg3IVQeQ_zTKcXdNOwaSf_k0

 - Reference: Andrew Ng's ML class

  1) https://class.coursera.org/ml-003/lecture

  2) http://holehouse.org/mlclass/ (note)

 

1. Hypothesis and Cost

 - The original hypothesis and cost

 - The simplified hypothesis and cost, with the bias term b removed (see the formulas below)
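In formula form (standard notation from the lecture, with m training examples):

\[ H(x) = Wx + b, \qquad \mathrm{cost}(W, b) = \frac{1}{m} \sum_{i=1}^{m} \left( H(x^{(i)}) - y^{(i)} \right)^2 \]

\[ H(x) = Wx, \qquad \mathrm{cost}(W) = \frac{1}{m} \sum_{i=1}^{m} \left( W x^{(i)} - y^{(i)} \right)^2 \]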

 

2. Gradient descent algorithm

 - An algorithm used for many minimization problems, including minimizing the cost function

 - Given cost(W, b), it finds the values of W and b that minimize the cost

 - Taking the derivative of the cost, the point where the slope becomes zero is the point where the cost is minimized

 - When using linear regression, we must check that the cost function is convex

   -> only then does gradient descent reliably find the correct minimum
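In update-rule form, gradient descent repeats W := W - α · (1/m) Σ 2(W·x_i - y_i)·x_i, where α is the learning rate. As a rough illustration before the TensorFlow versions below, here is a minimal pure-Python sketch of that rule on the same toy data this post uses; the sketch and its variable names are mine, not from the lecture:

x_data = [1, 2, 3]
y_data = [1, 2, 3]
W = 5.0      # deliberately bad starting value
alpha = 0.1  # learning rate

for step in range(20):
    # derivative of cost(W) = mean((W*x - y)^2) with respect to W
    grad = 2 * sum((W * x - y) * x for x, y in zip(x_data, y_data)) / len(x_data)
    W -= alpha * grad  # W := W - alpha * dcost/dW
    print(step, W)

# W approaches 1.0, the minimum of the convex cost curve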

 

3. TensorFlow implementation of cost minimization for linear regression

import tensorflow as tf
import matplotlib.pyplot as plt  # plotting library; install it beforehand with "pip install matplotlib"

X = [1, 2, 3]
Y = [1, 2, 3]

W = tf.placeholder(tf.float32)

# Our hypothesis for linear model X * W
hypothesis = X * W

# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))

# Variables for plotting cost function
W_history = []
cost_history = []

# Launch the graph in a session.
with tf.Session() as sess:
    for i in range(-30, 50):
        curr_W = i * 0.1
        curr_cost = sess.run(cost, feed_dict={W: curr_W})

        W_history.append(curr_W)
        cost_history.append(curr_cost)

# Show the cost function
plt.plot(W_history, cost_history)
plt.show()
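The loop sweeps W from -3.0 to 4.9 in steps of 0.1 and plots the resulting cost curve: a convex parabola whose minimum sits at W = 1, the value at which the hypothesis fits the data exactly.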

 

4. Applying the gradient descent algorithm

 (1) Manual implementation using the derivative

# Minimize: Gradient Descent using derivative: W -= learning_rate * derivative
learning_rate = 0.1  # the alpha (step size) value
gradient = tf.reduce_mean((W * X - Y) * X)
descent = W - learning_rate * gradient
update = W.assign(descent)
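Strictly speaking, the derivative of the cost is 2 · mean((W·X - Y) · X); the constant factor 2 is dropped here, which is harmless since it can be absorbed into the learning rate (the manual gradient in section 5 keeps it).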

 

  - full code

import tensorflow as tf
 
tf.set_random_seed(777) # for reproducibility
 
x_data = [1, 2, 3]
y_data = [1, 2, 3]
 
# Try to find a value for W that computes y_data = W * x_data
# We know that W should be 1
# But let's use TensorFlow to figure it out
W = tf.Variable(tf.random_normal([1]), name="weight")
 
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
 
# Our hypothesis for linear model X * W
hypothesis = X * W
 
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
 
# Minimize: Gradient Descent using derivative: W -= learning_rate * derivative
learning_rate = 0.1
gradient = tf.reduce_mean((W * X - Y) * X)
descent = W - learning_rate * gradient
update = W.assign(descent)
 
# Launch the graph in a session.
with tf.Session() as sess:
    # Initializes global variables in the graph.
    sess.run(tf.global_variables_initializer())

    for step in range(21):
        _, cost_val, W_val = sess.run(
            [update, cost, W], feed_dict={X: x_data, Y: y_data}
        )
        print(step, cost_val, W_val)
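With this data and learning rate 0.1, cost_val should shrink toward 0 and W_val converge to 1.0 within the 21 printed steps.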

 

 (2) Implementation using TensorFlow's Optimizer (no need to take the derivative ourselves)

# Minimize: Gradient Descent Magic
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
train = optimizer.minimize(cost)
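The minimize() call has TensorFlow differentiate the cost node with respect to every trainable variable by traversing the computation graph, so the manual gradient/descent/update lines from (1) are no longer needed.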

 

  - full code


import tensorflow as tf
tf.set_random_seed(777) # for reproducibility

# tf Graph Input
X = [1, 2, 3]
Y = [1, 2, 3]
W = tf.Variable(tf.random_normal([1]), name="weight")

# Linear model
hypothesis = X * W

# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))

# Minimize: Gradient Descent Optimizer
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)

# Launch the graph in a session.
sess = tf.Session()

# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())

for step in range(101):
    _, W_val = sess.run([train, W])
    print(step, W_val)
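Since tf.set_random_seed(777) fixes the random initialization, W starts from the same random value on every run; W_val should settle at 1.0 well before step 100.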

 

 

  - When the initial value of W is set to 5:

import tensorflow as tf
 
# tf Graph Input
X = [1, 2, 3]
Y = [1, 2, 3]
 
# Set wrong model weights
W = tf.Variable(5.0)
 
# Linear model
hypothesis = X * W
 
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
 
# Minimize: Gradient Descent Optimizer
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
 
# Launch the graph in a session.
with tf.Session() as sess:
    # Initializes global variables in the graph.
    sess.run(tf.global_variables_initializer())

    for step in range(101):
        _, W_val = sess.run([train, W])
        print(step, W_val)
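Even from the deliberately wrong start of 5.0, the optimizer slides W down the convex cost curve to 1.0; a bad initial value only changes how many steps this takes, not the answer.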

 

5. (Optional) Computing and applying gradients in TensorFlow

import tensorflow as tf
 
# tf Graph Input
X = [1, 2, 3]
Y = [1, 2, 3]
 
# Set wrong model weights
W = tf.Variable(5.)
 
# Linear model
hypothesis = X * W
 
# Manual gradient
gradient = tf.reduce_mean((W * X - Y) * X) * 2
 
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
 
# Minimize: Gradient Descent Optimizer
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
 
# Get gradients
gvs = optimizer.compute_gradients(cost)  # compute the optimizer's gradients as (gradient, variable) pairs
 
# Optional: modify gradient if necessary
# gvs = [(tf.clip_by_value(grad, -1., 1.), var) for grad, var in gvs]
 
# Apply gradients
apply_gradients = optimizer.apply_gradients(gvs)  # apply the (possibly modified) gradients to the variables
 
# Launch the graph in a session.
with tf.Session() as sess:
    # Initializes global variables in the graph.
    sess.run(tf.global_variables_initializer())

    for step in range(101):
        gradient_val, gvs_val, _ = sess.run([gradient, gvs, apply_gradients])
        print(step, gradient_val, gvs_val)
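compute_gradients() followed by apply_gradients() is what minimize() does internally, split into two steps so the gradients can be inspected or modified in between; the commented-out tf.clip_by_value line shows a typical modification (gradient clipping). Printing gradient_val alongside gvs_val confirms that the hand-written derivative, with the factor 2 kept this time, matches the gradient TensorFlow computes.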
