I attended and presented at the September 27, 2010 Atlanta BI PASS Users’ Group (the local group name is “Atlanta MDF” but it is a PASS Chapter). This group’s overall leader is Teo Lachev, also based in Atlanta and known particularly for his work with Reporting Services and Analysis Services. Teo’s website is Prologika. This meeting was the second one for this BI-focused group. I was excited when Teo announced this concept: although I appreciate presenting at SQL Server events, I believe that my data mining presentations will reach a more receptive audience when the overall program is focused on BI. By contrast, when I have presented at SAS conferences, the focus has been predominantly BI, and for years I have had the opposite wish, namely to hear more about how the server infrastructure relates to overall system success.
Our sponsor for the evening was Dundas, who paid for enough pizza and soda to feed the audience of about 30 people. This meeting started with a live video conference from the Dundas team to showcase their Dundas Dashboard product. Their team explained some of the features of this product, all built in Silverlight 4, and how it is aimed not just at business analysts but also at developers. We could see their demo on the projected screen, and they took questions from the audience interactively.
My presentation was Data Mining with PowerPivot (Excel 2010), and was aimed at an intermediate level. Continue reading “Atlanta BI PASS September 2010 — Post-Event Wrapup” »
Data Mining with Microsoft SQL Server 2008 Review Chapter 17
This chapter covers extending data mining. Specifically, the chapter does not deal directly with user interfaces, but instead with developer interfaces and machine learning algorithms. Perhaps the previous programming chapter was intended as the user interface extension. If we consider the common Model-View-Controller paradigm:
- Model — the machine learning algorithm (this chapter)
- View — the user interfaces (supposedly the last chapter)
- Controller — the means of achieving a goal (which at least will bring Analysis Services and perhaps SQL Server to the table)
In this three-part division, not much is said about the view, though this chapter does talk about viewers. This chapter does not have any specific exercises or user code, since a lot of information and resources are online. I believe that the material in this chapter could fill another book. The sections below describe the main online resources for extending this technology. Continue reading “Extending SQL Server Data Mining” »
Chris Webb takes on this topic in a recent blog post:
His discussion is a summary of a longer discussion on MSDN forums:
The issue is important for data mining, since in a recent blog post I commented on data mining dimensions and data mining cubes. You can click this link for the MSDN how-to on this topic. To summarize that help file: anyone can make data mining dimensions for certain data mining models, and BIDS automates the creation of a data mining cube which includes the data mining dimension along with the dimensions and measures from the source cube. Once added to a solution, dimensions can be added to any cube, so you could take these data mining dimensions and put them all into the source cube. These data mining dimensions only apply to:
- Microsoft Clustering
- Microsoft Decision Trees
- Microsoft Association Rules
Chris Webb provided his own summary on the multiple cube topic, and in this blog post I provide my analysis. Philosophically, I view this topic as similar to the question of whether to make one or more SQL Server databases. Continue reading “One or Multiple Analysis Services Cubes?” »
Data Mining with Microsoft SQL Server 2008 Review Chapter 11
This chapter talks about the association machine learning algorithm. The term “market basket analysis” is often used to characterize a common application, where a retailer may physically arrange items which are often purchased together (as in general merchandiser Walmart), may provide recommendations for similar products while a customer shops (as in online merchandiser Amazon), or may provide coupons for future purchases (as in supermarket retailer Kroger). The application can cover not just products but also a combination of products and services (as with a health care facility, or with an automobile repair shop).
The outline for this blog post:
- Recap of the Authors’ Solution Example
- The Authors’ DMX Code
Recap of the Authors’ Solution Example
The authors provided an example built on the Movie Click data in BIDS (Business Intelligence Development Studio). The intention of the mining structure and single mining model is to completely represent the demographic and output movie selection data. A possible use for this solution would be to make live recommendations to movie shoppers based on their demographics and on the movies they select (as they add them to their shopping cart).
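To make the recommendation idea concrete, here is a minimal Python sketch of how association rules can drive a "shoppers who picked this also picked that" suggestion. It is not the authors' DMX code and not the Microsoft Association Rules implementation. The baskets, the thresholds, and the function names `pair_rules` and `recommend` are all made up for illustration, and the sketch only looks at pairs of movies (no demographics), scoring each rule by the usual support and confidence measures.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets: each set is the movies one shopper selected.
# This is invented sample data, not the Movie Click data set.
baskets = [
    {"Alien", "Blade Runner", "Terminator"},
    {"Alien", "Blade Runner"},
    {"Alien", "Terminator"},
    {"Blade Runner", "Casablanca"},
    {"Alien", "Blade Runner", "Casablanca"},
]


def pair_rules(baskets, min_support=0.4, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for item pairs.

    support    = share of baskets containing both items
    confidence = share of baskets with the antecedent that also
                 contain the consequent
    """
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # pair is too rare to trust
        # Each frequent pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    # Strongest rules first; names break ties so the order is stable.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


def recommend(selected, rules):
    """Suggest consequents of rules whose antecedent is already in the cart."""
    in_cart = set(selected)
    suggestions = []
    for lhs, rhs, _support, _confidence in rules:
        if lhs in in_cart and rhs not in in_cart and rhs not in suggestions:
            suggestions.append(rhs)
    return suggestions


rules = pair_rules(baskets)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")

print(recommend(["Terminator"], rules))
```

With this toy data, a shopper whose cart holds only "Terminator" would be offered "Alien", because every hypothetical basket containing the first also contains the second. A production model would be trained and queried inside Analysis Services rather than in application code like this.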
Continue reading “Microsoft Association Rules” »
The Association for Computing Machinery produces a regular journal called SIGKDD Explorations, where SIGKDD is an acronym for Special Interest Group on Knowledge Discovery and Data Mining. I would classify the journal as academic, even though private-sector consultants or companies may be coauthoring articles.
In a recent issue, there is an article titled “Visual Analytics: How much visualization and how much analytics?”. The article makes the following claims:
- “Visual Analytics is the science of analytical reasoning supported by interactive visual interfaces.” (page 5)
- “The term Visual Analytics has been around for about five years now.” (page 5)
- “The core of our view on Visual Analytics is the new enabling and accessible analytic reasoning interactions supported by the combination of automated and visual analytics.” (page 5)
Altogether, these statements mean that Visual Analytics is a relatively new academic buzzword to define a specific field of research, namely the combination of automated analysis and visual representation. Someone might ask, how much does that description look like what people do with Excel? I would at first pass answer that Excel 2010 has exceptional graphic and visualization capabilities, but it does not inherently provide automated data analysis. However, SQL Server Data Mining adds the automated portion of this equation.
Continue reading “Visual Analytics and SQL Server Data Mining” »
Data Mining with Microsoft SQL Server 2008 Review Chapter 4
This chapter takes a complete look at how to develop data mining structures and data mining models using Business Intelligence Development Studio (BIDS). The authors’ outline (pages 127-128):
- Using BIDS
- Understanding immediate mode and offline mode
- Creating and modifying data sources, data source view, and data mining objects
- Exploring data and evaluating models
Let’s start with a fact: SQL Server Data Mining is a technology bound to SQL Server generally and Analysis Services specifically. This technology is neither a desktop nor a web application. It was created to be part of desktop and web applications, and was intended to allow people to modify and extend it using XMLA and DMX. BIDS (an implementation of the extended Visual Studio) is the best way to create the DDL required for production (and enterprise-level) data mining, and this chapter shows how to do that.
Continue reading “Using SQL Server Data Mining” »