Logistic regression: Stochastic Gradient Ascent (in Python)

In summary, the poster is following Stanford University's online machine learning course and implementing logistic regression with stochastic gradient ascent in Python. The log likelihood decreases rather than increases after each update to theta, and any learning rate larger than 0.0001 produces nan values. The code for the stochastic gradient ascent routine and the log likelihood function is included.
  • #1
NATURE.M
So I've been following an online course in machine learning offered by Stanford University. I have recently been reading up on logistic regression and stochastic gradient ascent. Here is a link to the original notes: http://cs229.stanford.edu/notes/cs229-notes1.pdf (pages 16-19).

Here is my code. The issue I'm having is that the log likelihood seems to be constantly decreasing (becoming more negative) rather than increasing after every change in theta. The value I'm using for alpha (the learning rate) is 0.0001, which is small, but anything larger produces nan values as output from the log likelihood function. I spent quite some time playing around with this value and looking over my code, and I can't figure out what's going wrong.

Code:
import numpy as np

def logregg(X, y, alpha, iterations):
    #output parameter vector

    (m, n) = np.shape(X)
    theta = np.array([0] * n, 'float')
    lin = np.dot(X, theta)
    hyp = 1./(1 + np.exp(-1*lin))
    i = 0
 
    #stochastic gradient ascent
    while i < iterations:
        for j in range(m):
            theta = theta + alpha*(y[j] - hyp[j])*X[j,:]
            print(loglik(X, y, theta))
        i+=1
        
    return theta

def loglik(X, y, theta):
    lin = np.dot(X, theta)
    hyp = 1./(1 + np.exp(-1 * lin))
    first = y*np.log(hyp)
    second = (1-y)*np.log(1-hyp)
 
    return np.sum(first + second)
 
  • #2
X = np.array([[1,2], [2,3], [3,4], [4,5], [5,6]])
y = np.array([1, 0, 1, 0, 1])
theta = logregg(X, y, 0.0001, 10)
print(theta)
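One thing that stands out in the code above is that hyp is computed once, from the initial theta, and never refreshed inside the loop, so every update uses predictions from theta = 0. Whether that fully explains the decreasing log likelihood is an assumption, but the update rule in the linked notes uses the hypothesis evaluated at the current theta. A minimal sketch of that variant follows; sigmoid and logregg_fixed are names introduced here for illustration, not part of the original post.

```python
import numpy as np

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def loglik(X, y, theta):
    """Log likelihood of theta under the logistic model."""
    hyp = sigmoid(np.dot(X, theta))
    return np.sum(y * np.log(hyp) + (1 - y) * np.log(1 - hyp))

def logregg_fixed(X, y, alpha, iterations):
    """Stochastic gradient ascent; the hypothesis is recomputed per sample."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iterations):
        for j in range(m):
            # Evaluate h(x_j) with the *current* theta before updating.
            hyp_j = sigmoid(np.dot(X[j], theta))
            theta = theta + alpha * (y[j] - hyp_j) * X[j]
    return theta

# Same toy data and settings as in the post.
X = np.array([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]], dtype=float)
y = np.array([1, 0, 1, 0, 1], dtype=float)
theta = logregg_fixed(X, y, 0.0001, 10)
print(theta, loglik(X, y, theta))
```

The nan values at larger learning rates may come from hyp saturating at exactly 0 or 1, which makes np.log return -inf; that is also a guess rather than something confirmed in the thread.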
 

1. What is logistic regression and how does it differ from linear regression?

Logistic regression is a classification algorithm used to predict categorical outcomes from a set of independent variables. It differs from linear regression in that it passes a linear combination of the inputs through the logistic (sigmoid) function, so its output is a probability between 0 and 1, while the output of linear regression is an unbounded continuous value.
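As a small illustration of that difference, the sketch below evaluates both outputs for the same input. The parameter vector and input are made-up values chosen only for the example.

```python
import numpy as np

def linear_output(x, theta):
    """Linear regression prediction: an unbounded real number."""
    return np.dot(x, theta)

def logistic_output(x, theta):
    """Logistic regression prediction: the same score squashed into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-np.dot(x, theta)))

theta = np.array([0.5, -0.25])  # hypothetical parameters
x = np.array([4.0, 2.0])        # hypothetical input

print(linear_output(x, theta))    # 1.5
print(logistic_output(x, theta))  # about 0.8176
```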

2. What is the stochastic gradient ascent algorithm and how is it used in logistic regression?

Stochastic gradient ascent is a variant of gradient ascent used to fit the parameters of a logistic regression model by maximizing the log likelihood. It updates the parameters after each individual training example rather than after a full pass over the dataset, so each update is cheap and the method often makes progress faster on large datasets, at the cost of noisier steps.
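For reference, the two update rules can be written side by side, using the notation of the notes linked in the first post (alpha is the learning rate, m the number of training examples):

```latex
% Batch gradient ascent: one update per pass over all m examples
\theta := \theta + \alpha \sum_{i=1}^{m} \left( y^{(i)} - h_\theta(x^{(i)}) \right) x^{(i)}

% Stochastic gradient ascent: one update per example i
\theta := \theta + \alpha \left( y^{(i)} - h_\theta(x^{(i)}) \right) x^{(i)},
\qquad h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}}
```

In both rules the hypothesis is evaluated at the current value of theta.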

3. Can logistic regression be applied to both binary and multiclass classification problems?

Yes. For binary classification, a single logistic regression model predicts the probability of one class versus the other. For multiclass classification, one common approach (one-vs-rest) trains one binary model per class and picks the class with the highest predicted probability; another is multinomial (softmax) regression, which models all classes jointly.
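A rough sketch of the one-model-per-class idea is below. The toy data, the learning rate, the number of epochs, and the helper names (fit_binary, fit_one_vs_rest, predict_one_vs_rest) are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_binary(X, y, alpha=0.1, epochs=200):
    """Binary logistic regression fitted by stochastic gradient ascent."""
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        for j in range(X.shape[0]):
            theta += alpha * (y[j] - sigmoid(X[j] @ theta)) * X[j]
    return theta

def fit_one_vs_rest(X, labels, classes):
    """Train one 'this class vs. everything else' model per class."""
    return np.array([fit_binary(X, (labels == c).astype(float))
                     for c in classes])

def predict_one_vs_rest(X, thetas, classes):
    """Pick the class whose model reports the highest probability."""
    scores = sigmoid(X @ thetas.T)  # shape: (samples, classes)
    return classes[np.argmax(scores, axis=1)]

# Toy data: an intercept column plus two features, three well-separated classes.
X = np.array([[1, 0.0, 0.0], [1, 0.5, 0.5],
              [1, 4.0, 0.0], [1, 4.0, 0.5],
              [1, 0.0, 4.0], [1, 0.5, 4.0]])
labels = np.array([0, 0, 1, 1, 2, 2])
classes = np.array([0, 1, 2])

thetas = fit_one_vs_rest(X, labels, classes)
preds = predict_one_vs_rest(X, thetas, classes)
print(preds)
```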

4. How is the performance of a logistic regression model evaluated?

For classification decisions, common metrics are accuracy, precision, recall, and F1 score, which compare the predicted labels with the actual labels. Because the model outputs probabilities, measures such as log loss and ROC AUC are also used to assess the probabilities themselves.
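These label-based metrics can be computed directly from the confusion-matrix counts. In the sketch below, y_true reuses the labels from the thread, while y_pred is a hypothetical set of predictions invented for the example.

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary 0/1 labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives
    fp = np.sum((y_pred == 1) & (y_true == 0))  # false positives
    fn = np.sum((y_pred == 0) & (y_true == 1))  # false negatives
    tn = np.sum((y_pred == 0) & (y_true == 0))  # true negatives
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

y_true = [1, 0, 1, 0, 1]
y_pred = [1, 0, 0, 0, 1]  # hypothetical predictions
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With these inputs there are 2 true positives, 0 false positives, 1 false negative and 2 true negatives, giving accuracy 0.8, precision 1.0, recall 2/3 and F1 0.8.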

5. Can logistic regression handle non-linear relationships between variables?

Not on its own. Logistic regression assumes a linear relationship between the independent variables and the log odds of the dependent variable. Non-linear relationships can still be captured by engineering features (for example polynomial or interaction terms), or by switching to a non-linear model.
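To illustrate the feature engineering route, the sketch below uses one-dimensional data where the positive class lies outside the interval [-1, 1], which no single threshold on x can separate. Adding x squared as a feature makes the classes linearly separable in the log odds. The data and the hand-picked theta are hypothetical; nothing is fitted here.

```python
import numpy as np

def add_squared_feature(x):
    """Build the design matrix [1, x, x^2] from a 1-D input."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones_like(x), x, x ** 2])

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
y = np.array([1, 0, 0, 0, 1])       # positive when |x| > 1

theta = np.array([-1.0, 0.0, 1.0])  # hypothetical: log odds = x^2 - 1
prob = 1.0 / (1.0 + np.exp(-(add_squared_feature(x) @ theta)))
pred = (prob > 0.5).astype(int)
print(pred)  # [1 0 0 0 1]
```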
