Data Mining and base knowledge.

Posted: 1/24/2010

Jedi Knight 1516  points  Jedi Knight
  • Joined on: 1/3/2010
  • Posts: 266

Hi.

 

Devin's recent article about mining reminded me of something I have been meaning to ask all of you about this subject.

 

SSAS provides data mining. Does it offer a wide variety of models, or just the best-known ones? And are the people doing SSAS data mining people with deep knowledge of mining, or does SSAS provide tools easy enough that anyone can create mining models?

 

I ask because I see that people in the US work a lot with this tool, while here in Portugal, unfortunately, I've noticed that decision makers (and companies) are not yet aware of the value of this great tool.

 

That is one question. The other: I started a master's degree in business intelligence, but for professional reasons I decided to quit. There I learned a lot about mining models such as clustering, neural networks, and many others. What I want to ask is: even if I hadn't learned anything about mining, could I (or anyone) create an SSAS mining model? Or must you have enough knowledge to decide which mining model best fits the business you are working in?


tags models, mining, ssas

Posted: 2/24/2010

Jedi Master 3999  points  Jedi Master
  • Joined on: 10/28/2009
  • Posts: 25

There are a number of built-in models, as well as the ability to create your own. It's worth checking out. SQLServerDataMining.com is a good place to get started!


Posted: 1/24/2011

Jedi Youngling 10  points  Jedi Youngling
  • Joined on: 1/24/2011
  • Posts: 5

Hi Marco

In addition to Adam's answer I will address your specific questions.

  • SQL Server Data Mining provides ten distinct ways to implement data mining, as I documented on my Machine Learning page: http://marktab.net/About/MachineLearning.aspx
  • Yes, anyone can create mining models, and the method I recommend is to start with Excel. See http://marktab.net/About/ExcelandDataMining.aspx
  • I also provided some sample implementations on my Getting Started with Microsoft Data Mining page: http://marktab.net/About/GettingStarted.aspx
  • Glad to hear you are from Portugal -- in the past I have worked on some data mining projects with people in Spain, for a large media company in Great Britain
  • My single best book recommendation is Data Mining with Microsoft SQL Server 2008 (MacLennan, Tang, Crivat) -- I provided a review and extended comments (including PowerShell code) on my blog:  http://www.marktab.net/datamining/?s=%22data+mining+with+microsoft+sql+server+2008%22
  • "Must" people have knowledge to determine the model of best fit?  The word "must" is too strong for any scientific conclusion.  Instead, the word "best" describes the goal to seek.  Data mining is inherently probabilistic, and therefore people draw their best conclusion from all known quantitative and qualitative information.  Data mining does not make decisions, but rather informs people so they can make the best decisions.  Better knowledge of the problem domain and of the mathematics behind the machine learning algorithms improves the probability of making the best decision possible.

Posted: 2/3/2011

Padawan 200  points  Padawan
  • Joined on: 12/1/2009
  • Posts: 40

Yes, anyone can do data mining, given the tools we have at our disposal.  But if you want to apply data mining and know that your results are appropriate and actionable, I think you have to know a good deal about your data -- its distribution, in particular -- and which data mining model to apply to a given problem.

That's my opinion, at least.  Maybe it's the engineering background in me that gives me pause when I try to predict hospital discharge costs using a standard data mining algorithm on data that is bimodal and has a long tail -- is it appropriate to apply this model?  What were the assumptions behind this model?  Can I trust the result?  Only way to know is to learn more about the models and the theory.
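As a rough illustration of that kind of check, here is a minimal Python sketch that compares the mean with the median and computes skewness before any model is chosen. The cost figures are made up for the example (they are not real hospital data), and the helper name `population_skewness` is hypothetical.

```python
import statistics

def population_skewness(values):
    """Fisher-Pearson skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return 0.0  # all values identical, so there is no asymmetry
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

# Made-up discharge costs: a cluster of routine stays, a second cluster of
# complicated stays, and two very expensive cases forming the long tail.
costs = [4000, 4200, 4500, 4800, 5000, 5100, 5300,
         18000, 19000, 20000, 21000,
         95000, 140000]

summary = {
    "mean": statistics.fmean(costs),
    "median": statistics.median(costs),
    "skewness": population_skewness(costs),
}
print(summary)
```

A mean far above the median together with a large positive skewness suggests that an algorithm assuming roughly symmetric, single-peaked data may not be appropriate without transforming or segmenting the data first. This check does not detect bimodality by itself; a histogram is still worth a look.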

Towards this end I am looking into formal education.  MSDN is only going to give you so much -- you really need more, I think.  Two online programs are available at Stanford and Central Connecticut State University. 

There is also an interesting course called "Machine Learning" available via YouTube and on iTunesU. I also found it here:  http://academicearth.org/courses/machine-learning .  It is the recording of the Machine Learning course in 2008 at Stanford.  I've only listened to the first class, broken up over a couple of drives to work, but it is already fascinating.  And I recall the professor saying somewhere in there that, as an expert in data mining, it is not uncommon for him to consult with a company where a data mining project has been going on for a month and to see immediately that all their work has been for nought, because they lacked some basic understanding of how to apply the algorithms.

 


Posted: 2/3/2011

Jedi Youngling 10  points  Jedi Youngling
  • Joined on: 1/24/2011
  • Posts: 5

 

stirone said:

 

Yes, anyone can do data mining, given the tools we have at our disposal.  ...  Maybe it's the engineering background in me that gives me pause when I try to predict hospital discharge costs using a standard data mining algorithm on data that is bimodal and has a long tail -- is it appropriate to apply this model?  What were the assumptions behind this model?  Can I trust the result?  Only way to know is to learn more about the models and the theory.

 

 

Thanks for the input.  I have specifically recommended one textbook which describes algorithms from the Microsoft BI stack, and I have encouraged my blog readers to read it.  There are other good books too, but I believe this one is good for business analysts.

My book review: http://www.marktab.net/datamining/index.php/2010/11/05/data-mining-for-business-intelligence-book-review/

My interview with the author: http://www.marktab.net/datamining/index.php/2010/11/17/galit-shmueli-interview-lead-author-of-data-mining-for-business-intelligence/

 


Posted: 3/10/2011

Jedi Youngling 2  points  Jedi Youngling
  • Joined on: 3/10/2011
  • Posts: 1

Excel 2007 has an add-in that makes data mining via SSAS easy, IMHO:

Microsoft SQL Server 2005 Data Mining Add-ins for Microsoft Office 2007

Microsoft SQL Server 2008 Data Mining Add-ins for Microsoft Office 2007

By far the most straightforward and easiest to set up is the classification/decision tree.  I've also used clustering and, rarely, forecasting.

Try this once you have installed one of the add-ins:

1) Click on the new Data Mining menu in Excel

2) Click on "connection" on the right side of the ribbon and set up a connection to your SSAS server

3) Go through the Classify wizard to view a decision tree. The data can simply be an Excel spreadsheet (column headings hold the variable names, e.g. store, product, category, sales, and each row holds an individual sale)

In short, fire up Excel 2007, install either the SQL Server 2005 or 2008 add-in, create a dummy data set on a worksheet, and start playing around.  Should be a couple hours at most before you are looking at a decision tree within Excel.
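If you would rather generate the dummy data set than type it in, the following Python sketch produces one in the layout described above: a header row of variable names, then one row per sale. The store names, product catalogue, and value ranges are invented for illustration, and the helper names `build_rows` and `to_csv_text` are hypothetical.

```python
import csv
import io
import random

# Column headings hold the variable names, as the Classify wizard expects.
COLUMNS = ["store", "product", "category", "sales"]

# Invented values, for illustration only.
PRODUCTS = {"apples": "produce", "bread": "bakery", "milk": "dairy"}
STORES = ["North", "South", "East"]

def build_rows(n, seed=0):
    """Return n made-up sales, one dict per individual sale."""
    rng = random.Random(seed)  # fixed seed keeps the data reproducible
    rows = []
    for _ in range(n):
        product = rng.choice(list(PRODUCTS))
        rows.append({
            "store": rng.choice(STORES),
            "product": product,
            "category": PRODUCTS[product],
            "sales": round(rng.uniform(1.0, 50.0), 2),
        })
    return rows

def to_csv_text(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = to_csv_text(build_rows(20))
print(csv_text)
```

Saving the printed text to a .csv file and opening it in Excel should give a worksheet you can point the wizard at. Random data like this will not yield a meaningful tree; it only lets you exercise the steps.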

--Bob harford

