[Python] finding the correct data mining approach

eherrtelle59 · Jul 20, 2012

I'm having trouble finding the correct approach to my (fairly simple) example.

Let's say I have several months of log-in times for a certain website. The data has been selected and cleaned so that I have a list of Date_Time values, one per log-in.

Now, suppose I want to predict the number of log-ins for the next two weeks, by day and hour, based on these past trends.

I imagine I would group the data by day of the week (assuming beforehand that Monday and Friday, say, follow different trends) and run a regression analysis to predict the next two Mondays, and likewise for the other days.

Similarly, I could group by hour and run a regression analysis to extrapolate the trend of log-ins.

Does anyone know of a resource that explains how to do this in Python? I want to keep this example fairly straightforward, but I'm open to other ideas on how to model this behavior more effectively.
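
For reference, the group-then-regress idea described above can be sketched with the standard library alone. This is only a minimal sketch under some assumptions: the log-ins are already `datetime` objects, a straight-line trend per (weekday, hour) slot is good enough, and the first and last observed days are complete. The function names `hourly_counts`, `fit_line`, and `forecast` are made up for illustration and do not come from any library.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta


def hourly_counts(logins):
    """Count log-ins per (calendar date, hour of day)."""
    return Counter((t.date(), t.hour) for t in logins)


def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:  # only one distinct x value: no trend can be estimated
        return mean_y, 0.0
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - b * mean_x, b


def forecast(logins, days_ahead=14):
    """Predict log-ins per (date, hour) for the days after the last observed date."""
    counts = hourly_counts(logins)
    first = min(d for d, _ in counts)
    last = max(d for d, _ in counts)

    # Build one series per (weekday, hour) slot, including hours with zero log-ins.
    series = defaultdict(lambda: ([], []))
    day = first
    while day <= last:
        for hour in range(24):
            xs, ys = series[(day.weekday(), hour)]
            xs.append((day - first).days)  # x = days since the first observation
            ys.append(counts.get((day, hour), 0))
        day += timedelta(days=1)

    # Fit a separate straight line to each slot.
    models = {key: fit_line(xs, ys) for key, (xs, ys) in series.items()}

    # Extrapolate each line; a count cannot be negative, so clip at zero.
    predictions = {}
    for offset in range(1, days_ahead + 1):
        day = last + timedelta(days=offset)
        x = (day - first).days
        for hour in range(24):
            a, b = models.get((day.weekday(), hour), (0.0, 0.0))
            predictions[(day, hour)] = max(0.0, a + b * x)
    return predictions
```

Calling `forecast(logins)` returns a dictionary keyed by `(date, hour)` covering the 14 days after the last observed date. With only a few weeks of history each slot has just a handful of points, so the fitted slopes will be noisy; averaging per slot instead of fitting a line may be the safer choice in that case.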

chiro · Jul 21, 2012

Hey eherrtelle59.

You should probably take a look at mlpy, a machine learning library for Python:

http://mlpy.sourceforge.net/

gsal · Jul 21, 2012

There is also LOWESS (locally weighted scatterplot smoothing), which fits a smooth trend curve through noisy data.
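
To give a rough idea of what LOWESS does, here is a simplified pure-Python sketch: for each point it fits a straight line to the nearest neighbours, weighted by a tricube kernel, and takes the fitted value at that point. It is an illustration only and leaves out the robustness iterations of the full algorithm; the function name `lowess` and the `frac` parameter as written here are choices made for this sketch, not any particular library's API.

```python
def lowess(xs, ys, frac=0.5):
    """Simplified LOWESS: one tricube-weighted local linear fit per point.

    Omits the robustness (outlier down-weighting) iterations of the full method.
    """
    n = len(xs)
    k = max(2, min(n, int(round(frac * n))))  # points used per local fit
    smoothed = []
    for x0 in xs:
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1.0  # bandwidth: distance to the k-th nearest point
        # Tricube weights; points at or beyond distance h get weight 0.
        ws = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(ws)
        mx = sum(w * x for w, x in zip(ws, xs)) / sw
        my = sum(w * y for w, y in zip(ws, ys)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
        sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        smoothed.append(my + slope * (x0 - mx))  # local line evaluated at x0
    return smoothed
```

For log-in data, `xs` could be the day index and `ys` the daily log-in count. A larger `frac` gives a smoother curve, a smaller one follows the data more closely. Note that this smooths the observed range only; it does not extrapolate into the future by itself.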
