[Python] finding the correct data mining approach

  • Context: Python 
  • Thread starter Thread starter eherrtelle59
  • Start date Start date
  • Tags Tags
    Approach Data Python
Click For Summary
SUMMARY

The discussion focuses on predicting website log-in times using data mining techniques in Python. The user aims to analyze months of cleaned log-in data to forecast log-ins for the next two weeks, suggesting clustering by day and hour followed by regression analysis. Resources such as the MLpy library and LOWESS (Locally Weighted Scatterplot Smoothing) are recommended for implementing these techniques. The conversation emphasizes the importance of understanding trends based on different days of the week and hours of the day.

PREREQUISITES
  • Familiarity with Python programming
  • Understanding of regression analysis techniques
  • Knowledge of clustering methods in data mining
  • Experience with time series data analysis
NEXT STEPS
  • Explore the MLpy library for data mining in Python
  • Learn about LOWESS for smoothing data trends
  • Research clustering algorithms suitable for time series data
  • Study regression analysis methods specifically for forecasting
USEFUL FOR

Data scientists, Python developers, and analysts interested in predictive modeling and time series analysis will benefit from this discussion.

eherrtelle59
Messages
25
Reaction score
0
I'm having trouble finding the correct approach to my (fairly simple) example.

Let's say I have months of data for log-in times of a certain website. The data has been selected and cleaned such that I have a list of Date_Time for each log-in.

Now, suppose I wanted to predict the log-ins for the next two weeks by day and hour, based on these past trends.

I imagine I would cluster the data by day (assuming beforehand that there will be different trends with respect to Monday vs. Friday) and make some regression analysis to predict the next two (say) Mondays.

Similarly, I could cluster by the hour and do a regression analysis to extrapolate the trend of log-ins.

Anyone know of a resource which tells you how to do this in Python? I want to keep this example fairly straightforward, but I'm open to any more ideas on how to model this behavior more efficiently.
 
Technology news on Phys.org

Similar threads

  • · Replies 1 ·
Replies
1
Views
3K
  • · Replies 8 ·
Replies
8
Views
2K
  • · Replies 1 ·
Replies
1
Views
1K
  • · Replies 11 ·
Replies
11
Views
2K
Replies
2
Views
3K
  • · Replies 4 ·
Replies
4
Views
2K
  • · Replies 2 ·
Replies
2
Views
1K
Replies
7
Views
3K
  • · Replies 7 ·
Replies
7
Views
8K
Replies
1
Views
2K