Choice of Pipelines for Data Analysis


Discussion Overview

The discussion revolves around the choice of data analysis pipelines, specifically comparing traditional methods like Linear/Multilinear Regression using Python Pandas with approaches using TensorFlow (TF). Participants explore the implications of using different frameworks and the potential outcomes based on model structure and activation functions.

Discussion Character

  • Exploratory, Technical explanation, Debate/contested

Main Points Raised

  • One participant questions whether TensorFlow would produce the same output as traditional methods given the right choice of activation functions, suggesting that this is a primary variable in the analysis.
  • Another participant argues that if the model structure, metrics, and data normalization are consistent, TensorFlow should converge on the same coefficients as traditional methods, but notes that hidden layers in neural networks introduce non-linear fits that may enhance performance.
  • There is a humorous exchange regarding the meaning of "TF," with participants jokingly interpreting it in different ways.
  • Resources are shared, including book recommendations that may assist in understanding machine learning pipelines and project setups.

Areas of Agreement / Disagreement

Participants express differing views on the convergence of TensorFlow and traditional methods, indicating that while some believe they can yield similar results, others highlight the advantages of neural networks in terms of performance.

Contextual Notes

The discussion does not resolve the assumptions regarding the impact of activation functions or the influence of hidden layers on model performance. There are also references to specific tutorials and resources that may not be universally applicable.

Who May Find This Useful

Individuals interested in data analysis, machine learning frameworks, and those exploring the differences between traditional statistical methods and modern neural network approaches may find this discussion relevant.

WWGD
TL;DR
What kind of rules of thumb are there to decide choice of pipeline?
Hi,
So say I have some data to process. I am trying, say, Linear/Multilinear Regression. I know how to do this within Python Pandas. I can learn how with TensorFlow (TF). Would TF produce the same output given the "right" choice of Activation Functions*? Or would it output a model that is somehow "More General"?

* I assume this is the only/main variable affecting this choice and not other variables such as choice of metrics, sessions, etc.
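
One way to make the activation-function question concrete is the following minimal sketch in plain Python (not TensorFlow). It assumes a toy network with one input, one hidden layer of two units, and a linear output; every weight below is an arbitrary illustrative value, and the helper names `network` and `collapsed_linear` are hypothetical. With the identity activation the whole network reduces to a single linear map, so it cannot be more general than a linear regression. With ReLU in the hidden layer the output is no longer linear in the input.

```python
# Sketch: a tiny 1-input network with one hidden layer of two units.
# All weights are arbitrary illustrative values, not fitted to any data.

def identity(z):
    return z

def relu(z):
    return max(0.0, z)

def network(x, activation):
    """Hidden layer (2 units) followed by a linear output unit."""
    w1, b1 = [1.5, -2.0], [0.5, 1.0]   # hidden-layer weights and biases
    w2, b2 = [2.0, 0.5], -1.0          # output-layer weights and bias
    hidden = [activation(w * x + b) for w, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2

def collapsed_linear(x):
    """The single linear map the identity-activation network reduces to."""
    slope = 2.0 * 1.5 + 0.5 * (-2.0)          # sum of w2[i] * w1[i]
    intercept = 2.0 * 0.5 + 0.5 * 1.0 - 1.0   # sum of w2[i] * b1[i], plus b2
    return slope * x + intercept

# Columns: x, identity network, collapsed linear map, ReLU network.
for x in [-2.0, 0.0, 1.0, 3.0]:
    print(x, network(x, identity), collapsed_linear(x), network(x, relu))
```

The first two printed columns after `x` agree for every input, while the ReLU column departs from them wherever a hidden unit is clipped to zero.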
 
Given the same model structure (choice of metrics, properly normalized data, loss function), I don't see why TF should not converge on the same coefficients as a traditional calculation. However, with a neural network we introduce hidden layers that create a non-linear fit, which in most cases will perform better.

Have you worked through the Keras tutorial on the fuel efficiency dataset?
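
The convergence point can be illustrated with a small sketch in plain Python rather than TensorFlow itself. It assumes a single linear unit (identity activation, no hidden layers) trained by full-batch gradient descent on mean squared error, which is roughly what a one-unit dense layer with an MSE loss amounts to. The data, learning rate, epoch count, and the helper names `ols_fit` and `gd_fit` are all made up for illustration.

```python
# Sketch: one linear unit (identity activation) trained with MSE loss,
# compared against the closed-form ordinary least squares (OLS) fit.
# The data and hyperparameters below are made up for illustration.

def ols_fit(xs, ys):
    """Closed-form slope and intercept for simple linear regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def gd_fit(xs, ys, lr=0.05, epochs=5000):
    """Full-batch gradient descent on MSE for y_hat = w * x + b."""
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical, already-scaled inputs with a roughly linear response.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
ys = [-1.1, 0.2, 0.9, 2.1, 2.9, 4.2]

ols_w, ols_b = ols_fit(xs, ys)
gd_w, gd_b = gd_fit(xs, ys)
print(f"OLS: w={ols_w:.4f}, b={ols_b:.4f}")
print(f"GD:  w={gd_w:.4f}, b={gd_b:.4f}")
```

Because MSE for a linear model is convex, the iterative fit has the same minimizer as the closed-form one; in practice the two agree only up to the optimizer's tolerance, and poorly scaled inputs or a badly chosen learning rate can slow or prevent convergence.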
 
Does TF stand for TensorFlow or The f$%*? ;). Thanks for your answer. Will look up the link; thanks.
 
WWGD said:
Does TF stand for TensorFlow or The f$%*? ;).
I must admit to having used the words "why won't you converge you f$%*?" or similar on a number of occasions.
 
