
[Python] finding the correct data mining approach

  1. Jul 20, 2012 #1
    I'm having trouble finding the correct approach to my (fairly simple) example.

    Let's say I have months of log-in data for a certain website. The data has been selected and cleaned so that I have a list of Date_Time values, one per log-in.

    Now, suppose I wanted to predict the log-ins for the next two weeks by day and hour, based on these past trends.

    I imagine I would group (cluster) the data by day of the week, assuming beforehand that Mondays will show different trends than, say, Fridays, and then run a regression analysis on each group to predict the next two Mondays, for example.

    Similarly, I could cluster by the hour and do a regression analysis to extrapolate the trend of log-ins.
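    A minimal sketch of this idea, using only the standard library and assuming the log-ins are available as a list of `datetime` objects. The function names (`daily_counts`, `fit_line`, `predict_weekday`, `hourly_shares`) are made up for illustration, and a straight-line fit is only one possible choice of regression. Dates with no log-ins at all are simply missing from the counts here, which may or may not be acceptable for real data.

```python
from collections import Counter
from datetime import datetime, timedelta


def daily_counts(logins):
    """Count log-ins per calendar date."""
    return Counter(ts.date() for ts in logins)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:  # a single distinct x value: fall back to a flat line
        return mean_y, 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def predict_weekday(logins, weekday, weeks_ahead=2):
    """Extrapolate daily totals for one weekday (0 = Monday)."""
    counts = daily_counts(logins)
    days = sorted(d for d in counts if d.weekday() == weekday)
    # x is the number of weeks since the first observed day of this kind
    xs = [(d - days[0]).days // 7 for d in days]
    ys = [counts[d] for d in days]
    a, b = fit_line(xs, ys)
    return [a + b * (xs[-1] + k) for k in range(1, weeks_ahead + 1)]


def hourly_shares(logins, weekday):
    """Fraction of that weekday's log-ins falling in each hour 0-23."""
    hours = Counter(ts.hour for ts in logins if ts.weekday() == weekday)
    total = sum(hours.values())
    return [hours[h] / total if total else 0.0 for h in range(24)]


if __name__ == "__main__":
    # Synthetic data (an assumption for the demo): four Mondays, growing traffic
    start = datetime(2012, 1, 2, 9, 0)
    demo = [start + timedelta(weeks=k, minutes=i)
            for k in range(4) for i in range(10 + 2 * k)]
    print(predict_weekday(demo, weekday=0))
```

    Multiplying each predicted daily total by the hourly shares for that weekday gives a rough day-and-hour forecast; this assumes the hourly profile of a given weekday stays stable over time.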

    Does anyone know of a resource that explains how to do this in Python? I want to keep this example fairly straightforward, but I'm open to other ideas on how to model this behavior more efficiently.
  3. Jul 21, 2012 #2



  4. Jul 21, 2012 #3
    There is also LOWESS (locally weighted scatterplot smoothing).
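    To illustrate what LOWESS does, here is a simplified pure-Python sketch: for each point it fits a weighted straight line to the nearest neighbours using tricube weights. It omits the robustness iterations of the full method, and the name `lowess_fit` and its `frac` parameter are chosen here for illustration only. For real work, a library implementation such as the `lowess` function in statsmodels is likely the better choice.

```python
def lowess_fit(xs, ys, frac=0.5):
    """Smooth ys over xs with locally weighted linear regression.

    Simplified sketch: tricube weights, no robustness iterations.
    """
    n = len(xs)
    k = max(2, min(n, int(round(frac * n))))  # neighbours per local fit
    smoothed = []
    for x0 in xs:
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1.0  # bandwidth: distance to the k-th nearest point
        # Tricube weight: 1 at x0, falling to 0 at distance h and beyond
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        smoothed.append(my + slope * (x0 - mx))  # local line evaluated at x0
    return smoothed
```

    Applied to daily or hourly log-in counts indexed by time, this gives a smooth trend curve without assuming the trend is a single straight line. Note that it smooths the observed range; extrapolating two weeks ahead still needs a separate assumption about how the trend continues.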