Segue for Presentation: SQL Server/Database and Data Mining

For a job-interview presentation on SQL Server and data mining, one starting point is SSAS (SQL Server Analysis Services) and SSDT (SQL Server Data Tools). SQL and T-SQL are used to extract and manipulate the data, while the machine learning programs themselves can be written in various languages and use strategies such as neural networks to identify customer behaviors. One example is training a neural net on past data to predict customer dissatisfaction. Apache Spark is suggested for distributing the data and the processing in machine learning tasks, particularly with large datasets. The discussion stresses denormalizing data for mining, since mining a flat file was found to be much faster than querying normalized tables. It also advises approaching the interview with confidence: acknowledge gaps in knowledge, express a willingness to learn, and turn the questions into a dialogue about data mining strategies.
WWGD
Hi All,
I need to do a presentation for a job interview covering SQL Server, general databases, and data mining. Any ideas? My first thought is to use SSAS (Analysis Services) and SSDT (Data Tools) from SQL Server for data mining, but this does not seem clear enough to me. I am not sure whether one actually uses SQL or T-SQL in either of these platforms. What language are the machine learning (ML) programs written in?
Any other suggestions?
Thanks.
 
Read up on Apache Spark and how it distributes the work of machine learning.

https://en.m.wikipedia.org/wiki/Apache_Spark

You might get some general data mining insight from this IBM Redbook

http://jliusun.bradley.edu/~jiangbo/Redbooks/sg245252IMGuide.pdf

Data mining uses different strategies to find trends in data. One such strategy is to train a neural net to identify customers with a certain trait or behavior using other customers who've demonstrated that behavior or have that trait.

As an example, say we have a database of a bank's customers. We want to find out which of them are dissatisfied with the bank and are thinking of leaving. We train a neural net using customers who have already left and ask it to score the remaining customers. Then we select the customers with the highest scores and market a bank product to them to try to keep them as customers.
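
To make the scoring idea concrete, here is a minimal sketch in plain Python. It is not what any particular mining tool does: it swaps a real neural net for the simplest possible one, a single sigmoid unit (that is, logistic regression) trained by gradient descent. The feature names, customer names, and all of the numbers are made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(rows, labels, epochs=500, lr=0.5):
    """Fit one sigmoid unit with per-sample gradient descent on log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            pred = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = pred - y  # gradient of log loss w.r.t. the pre-activation
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def score(model, x):
    """Return a 0..1 score; higher means 'looks more like a customer who left'."""
    weights, bias = model
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Hypothetical features scaled to 0..1: (complaint rate, recent balance drop)
history = [(0.9, 0.8), (0.8, 0.9), (0.7, 0.6), (0.1, 0.2), (0.2, 0.1), (0.0, 0.3)]
left_bank = [1, 1, 1, 0, 0, 0]  # 1 = customer left, 0 = customer stayed

model = train(history, left_bank)

# Score the remaining customers and rank them, most at risk first.
current = {"alice": (0.85, 0.7), "bob": (0.1, 0.1), "carol": (0.5, 0.6)}
scores = {name: score(model, x) for name, x in current.items()}
at_risk = sorted(scores, key=scores.get, reverse=True)
print(at_risk)
```

A real project would use a proper library and far more data, but the shape is the same: train on customers whose outcome is known, then score the ones whose outcome is not.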

SQL is used to extract the customers into a file, and the mining tools process the file and output a score. SQL is used to add the score back to the database. SQL is then used to extract the high-scoring customers for our marketing campaign.
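
The three SQL steps above can be sketched roughly as follows. This assumes Python's built-in sqlite3 as a stand-in for SQL Server, and the table name, column names, threshold, and the `mining_tool_score` function are all hypothetical placeholders; in practice the scoring would be done by the mining tool, not a one-line formula.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the real database
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, name TEXT, complaints INTEGER, churn_score REAL)"
)
conn.executemany(
    "INSERT INTO customers (id, name, complaints) VALUES (?, ?, ?)",
    [(1, "alice", 5), (2, "bob", 0), (3, "carol", 2)],
)

# Step 1: SQL extracts the customers (here into a list rather than a file).
rows = conn.execute("SELECT id, complaints FROM customers").fetchall()

# Step 2: the mining tool scores each customer (hypothetical placeholder).
def mining_tool_score(complaints):
    return min(1.0, complaints / 5.0)

scored = [(mining_tool_score(complaints), cid) for cid, complaints in rows]

# Step 3: SQL adds the scores back to the database.
conn.executemany("UPDATE customers SET churn_score = ? WHERE id = ?", scored)
conn.commit()

# Step 4: SQL extracts the high-scoring customers for the campaign.
targets = conn.execute(
    "SELECT name, churn_score FROM customers "
    "WHERE churn_score >= ? ORDER BY churn_score DESC",
    (0.4,),
).fetchall()
print(targets)
```

The point for a presentation is that SQL handles the data movement on both sides, while the model itself lives outside the database.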

Apache Spark could be used to manage the mining process in a more efficient, distributed fashion.
 
Thanks, Jedi. I assume that if we have an OLTP setup we would want to denormalize, while if we have an OLAP setup we may want to eliminate redundancy and keep a normalized database? EDIT: I will owe you if I get my (entry-level) big data job.
 
Yes, that's basically it. We found that running SQL queries to collect all the data from the various star schema tables during mining was far slower than first building a flat, denormalized file and mining that. I think this is still true, and I believe Apache Spark takes a similar approach as it distributes the data across the machines on the network.
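
As a rough illustration of the flattening step, the sketch below joins a tiny star schema once and writes the result as a flat CSV that a mining tool could read without any further joins. It again assumes sqlite3 as a stand-in for the real database, and every table name, column name, and value is invented for the example.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, segment TEXT);
CREATE TABLE dim_branch (branch_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE fact_transactions (customer_id INTEGER, branch_id INTEGER, amount REAL);
INSERT INTO dim_customer VALUES (1, 'retail'), (2, 'business');
INSERT INTO dim_branch VALUES (10, 'north'), (20, 'south');
INSERT INTO fact_transactions VALUES (1, 10, 50.0), (1, 20, 25.0), (2, 20, 900.0);
""")

# Do the joins once, up front, instead of repeating them while mining.
cursor = conn.execute("""
SELECT f.customer_id AS customer_id,
       c.segment     AS segment,
       b.region      AS region,
       f.amount      AS amount
FROM fact_transactions AS f
JOIN dim_customer AS c ON c.customer_id = f.customer_id
JOIN dim_branch   AS b ON b.branch_id   = f.branch_id
ORDER BY f.customer_id, f.amount
""")

# Write the denormalized rows to a flat file (an in-memory buffer here).
flat_file = io.StringIO()
writer = csv.writer(flat_file)
writer.writerow([col[0] for col in cursor.description])  # header row
writer.writerows(cursor)

flat_lines = flat_file.getvalue().splitlines()
print("\n".join(flat_lines))
```

Each output row repeats the customer's segment and the branch's region, which is exactly the redundancy a normalized design avoids, but it means the mining pass only has to read one file.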
 
I'm no longer in the data mining area. I moved on to scientific programming a few years ago, but we're looking at using Apache Spark for a project. Who knows what'll happen next.

Cheers, take care. Good luck with the job interview. Try not to get bogged down in the details of their questions, and answer honestly and confidently. They can't expect you to know everything about data mining, but just knowing the terms and strategies will go a long way toward convincing them.

Remember, when you don't know something, say so, and then say you'll definitely review or research it. Try to turn the interview into a dialogue instead of a question-and-answer session, with you offering suggestions on how you can help them with their work.
 