Tag Archives: PowerPivot

Excel is the Number One Commercial Tool for Analytics, Data Mining, and Big Data

Every year, the respected kdnuggets.com asks its users which tools they use for analytics, data mining, and big data.  This morning I saw the results summarized on a StatSoft STATISTICA website, which claimed STATISTICA as the top commercial analytics product.  I went ahead and tweeted that finding and picture, which placed Microsoft SQL Server in seventh place.

Later, at my desktop, I looked for the original poll.  To my surprise, both kdnuggets.com and statsoft.com had misinterpreted the results.  I will set the record straight in this post: Microsoft Excel is the number one commercial software named in this annual poll.  I present the evidence for this conclusion later in this post.

Some might dismissively claim that Microsoft Excel is just another spreadsheet.  However, smart people who have been reading this blog know that I have, for years, shown what Excel can do for data analytics:

  • Microsoft Data Mining with Excel (requiring SQL Server’s Analysis Services) can perform any of nine algorithms (Microsoft’s Decision Trees, Linear Regression, Sequence Analysis, Forecasting — Time Series, Neural Networks, Logistic Regression, Clustering, Association Analysis — Market Basket, Naive Bayes)
  • Microsoft also offers PowerPivot for Excel, allowing relational table data inside Excel 2010 workbooks — this Vertipaq technology was extended into what Microsoft recently released as “Tabular Mode” in SQL Server 2012 Analysis Services.

Now, I present the evidence for my correction.

Continue reading “Excel is the Number One Commercial Tool for Analytics, Data Mining, and Big Data” »

Recap of SQL Saturday 79 South Florida

I decided on August 9, 2011 that I wanted to go to SQL Saturday 79 in South Florida, as I blogged about previously. And was I glad I went. I enjoyed being not only a presenter but also an attendee and participant. This blog post has the details.

Continue reading “Recap of SQL Saturday 79 South Florida” »

Tableau Software “Data Mining” Visualizations

Tableau Software produces first-rate visualization software.  They are among the many open-source and commercial visualization products (and services) listed by KDNuggets: http://www.kdnuggets.com/software/visualization.html.

As blog readers remember from February 2011, I posted on DevExpress and their marketing of the phrase “data mining” for their visualization software.  As a result of that blog post, I had productive interaction with one of their product experts who conceded the point.  Further, as you may remember from that post, my intention was to challenge commercial vendors to put machine learning algorithms even in the “View” layer (of the MVC design pattern).

Today, I report on heavy marketing by Tableau Software to associate themselves with “data mining” largely without machine learning (along the way, we also catch similar Microsoft marketing for PowerPivot).  Both Tableau Software and Microsoft produce visualizations with trend lines, which arguably are calculated regressions.  However, trend lines alone do not encompass the rich science behind machine learning algorithms, even those available in SQL Server Data Mining since 2005. The difference provides a competitive opportunity for the much-needed visualization vendors.
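To make the trend-line point concrete, here is a minimal sketch in Python of what a trend line amounts to: one ordinary least-squares line fitted through the points. This is not code from Tableau Software or Excel; the function name `fit_trend_line` and the sample data are hypothetical and only for illustration.

```python
def fit_trend_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line through the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical sample data (assumed values, not from any real data set)
weeks = [1, 2, 3, 4, 5]
prices = [3.0, 3.5, 4.0, 4.5, 5.0]
slope, intercept = fit_trend_line(weeks, prices)
```

With this sample, the fitted slope is 0.5 and the intercept is 2.5. The whole "model" is two numbers, which is why a trend line by itself falls well short of the decision trees, clustering, and neural networks that data mining software offers.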

Visualization alone is not data mining.  If visualization were data mining, then Excel 2010 alone, with all its fancy built-in graphs, would be considered “data mining” (but read on since Excel 2010 does do nifty linear regression visualization, and Tableau Software has nice trend lines too).  Under a loose “data mining” assumption, all spreadsheets going back to my earlier favorites, Lotus 1-2-3 and VisiCalc, would be “data mining” software.  I liked Lotus 1-2-3 graphs, and seeing how they changed along with source data.  Stopping at VisiCalc circa 1983 does NOT promote the incredible machine learning science developed since then.  C-level executives and venture capitalists looking to invest in the next big “data mining” systems should not be paying for just 1985 technology.  It’s 2011; invest your money more wisely.

In this blog post:

  • A demonstration of how Tableau Software is marketing their “data mining” visualizations
  • An example of how someone used Tableau Software to connect to SQL Server Data Mining
  • A challenge to visualization entrepreneurs to incorporate machine learning into their software
  • My own gasoline data example discussing how to see the known and unknown

I have a variety of people reading this blog post including:

  • Analysts who use data mining to produce models
  • C-level executives and venture capitalists wanting to know what to look for in visual analytics software
  • Visualization developers looking for that next competitive edge in the growing business intelligence industry

Someone might be in all these groups, but hopefully my comments will help you explain this “data mining” issue to other groups.

Continue reading “Tableau Software “Data Mining” Visualizations” »

Data Mining Implications of Gartner’s 2011 Projections

Industry analysis organization Gartner announced four major trends for the next few years. This blog post projects implications for data mining (in general).

Prediction #1: “By 2013, 33 percent of BI functionality will be consumed via handheld devices.”

From their text, they include the tablet, which you may remember from previously released laptops with flippable screens (to make them a tablet with a stylus). The obvious technology which has driven this topic is the Apple iPad, but others are on the way, and vendors are trying to cash in on the wave of renewed interest in tablet devices. I recently purchased a small Asus computer, larger than a netbook, but allowing me to blog from just about anywhere. Asus also made a tablet announcement at the Consumer Electronics Show recently.

Continue reading “Data Mining Implications of Gartner’s 2011 Projections” »

Vertipaq, Analysis Services, and Data Mining

Rob Collie formerly worked for Microsoft, and now blogs at PowerPivotPro.com.  His blog is included in my Recommended Blogs for 2011.  Late last year, he posted his comments on Vertipaq and Analysis Services.  I would expect my regular reading audience to be used to models, so let’s start with Rob’s graphic from http://powerpivotpro.com/2010/11/12/five-observations-from-sql-pass/:

[Figure: Rob Collie’s graphic on Vertipaq and Analysis Services]

In the blog post, Rob tells the story which we all know by now:

  • PowerPivot is an expression of the Vertipaq technology as an Excel add-in
  • Vertipaq will be moved into Analysis Services along with the DAX language
  • The rapid adoption of PowerPivot so far provides confidence that improvements in this area will continue

Continue reading “Vertipaq, Analysis Services, and Data Mining” »

Windows Azure Marketplace DataMarket and Data Mining

This project was formerly codenamed Dallas, but now has a name, story and content too.  In this post I introduce the story and link its relevance to SQL Server Data Mining.

Yes, I admit I use technologies that compete with Microsoft, and I am about to make a comparison.  Data mining requires content, and you may already have your own content stored in a database (like SQL Server) or in a cube (like SQL Server Analysis Services).  What I have liked about my iPod and iPhone is that the iTunes application allows me to add content, predominantly music, but also all types of content.  Windows Azure Marketplace DataMarket is intended as a way to add additional content.  In this post I show how I went through the journey of adding free content from the Windows Azure Marketplace DataMarket.

Continue reading “Windows Azure Marketplace DataMarket and Data Mining” »

SQL Server Data Mining and Apollo Columnstore Indexes

Note: This post was revised November 12, 2010 to clarify the brand names Apollo and VertiPaq (thanks Denny Lee of SQLCAT) — and I extended comments on Amir Netz’s C++ versus C# analogy which I believe clarifies the discussion between what I have termed managed and unmanaged aggregations.  

This week’s PASS Summit conference included several demonstrations and announcements of the next version of SQL Server, version 11, codenamed Denali. In this blog post I have the following goals:

  • Outline Apollo columnstore indexes as a competitive Microsoft technology
  • Respond to Microsoft claims about the comparative performance advantages of columnstore indexes specifically for aggregations
  • Respond to Chris Webb’s multiple blog posts (posted from Seattle, WA) about the future of SQL Server Analysis Services

These topics seem like a lot to take on in one blog post, but in context, Microsoft found a way to introduce columnstore indexes in an 8-page whitepaper. As regular blog readers know, I put on my scientific hat first when trying to distinguish science from science fiction…

Continue reading “SQL Server Data Mining and Apollo Columnstore Indexes” »

SQL Saturday 48 Columbia SC — Post Event Wrapup

I was among the presenters at the October 2, 2010 SQL Saturday in Columbia SC. This post has a recap of the event, and toward the end I have my slides from the event.

The speaker dinner was a gift to speakers at the Grecian Gardens restaurant in West Columbia. Most people ordered Italian dishes off the menu, but I had a beef Greek dish which was good. About 25 people were there, and of the group immediately around me, many are on Twitter:

The point being that if you want to be a speaker at a Microsoft conference, it helps to be using Twitter (or at least just get an account). During dinner we talked again about BI (Business Intelligence) and what “self-service BI” means to people.

MarkTab: Self-Service BI Means Excel

Excel has been around for decades, and I would argue it is the primary self-service business intelligence tool.

Continue reading “SQL Saturday 48 Columbia SC — Post Event Wrapup” »

Raman Iyer Interview — Microsoft Analysis Services

Hi Everyone

I’m excited to present the first interview that I am posting to this blog. In these interviews, I’m generally looking for leaders and influential voices in business intelligence in general and data mining in particular.

This first interview is with Raman Iyer, an experienced leader and developer for Microsoft. Raman chose to write responses to my questions, and he included links too (knowing that the product would be web-based). You can find Raman supporting http://www.sqlserverdatamining.com and often responding to technical questions on MSDN Forums.

Quick bio:
Raman Iyer is Principal Development Manager for the Analysis Services Engine development team, responsible for building the server that powers Microsoft’s core Business Intelligence offerings in SQL Server, including OLAP, PowerPivot (In-Memory BI) and Data Mining. He was a founding member of the SQL Server Data Mining team, developing early prototypes and core DM engine features in the 2000 and 2005 releases before going on to lead the Data Mining development team through the SQL Server 2008 and DM Add-in releases.

How did you come to work for Microsoft?

A long time ago, in a galaxy far, far away (now part of the SAP universe), I was happily geeking out

Continue reading “Raman Iyer Interview — Microsoft Analysis Services” »

Atlanta BI PASS September 2010 — Post-Event Wrapup

Hi Everyone

I attended and presented at the September 27, 2010 Atlanta BI PASS Users’ Group (the local group name is “Atlanta MDF” but it is a PASS Chapter).  This group’s overall leader is Teo Lachev, also based in Atlanta and known particularly for his work with Reporting Services and Analysis Services.  Teo’s website is Prologika.  This meeting is the second one for this BI-focused group.  I was excited when Teo announced this concept, since though I appreciate presenting at SQL Server events, I believe that my data mining presentations will meet a more focused audience when the overall structure is focused on BI.  By contrast, when I have presented at SAS conferences, the focus is predominantly BI and for years I have had feelings in the other direction, namely to hear more about how the server infrastructure relates to overall system success. 

Our sponsor for the evening was Dundas, who paid for sufficient pizzas and soda to feed the audience of about 30 people.  This meeting started with a live video conference from the Dundas team to showcase their Dundas Dashboard product.  Their team explained some of the features of this product, all in Silverlight 4, and how it serves not just business analysts but also developers.  We could see their demo on the projected screen, and they took interactive questions from the audience.

My presentation was Data Mining with PowerPivot (Excel 2010), and was aimed at an intermediate level.

Continue reading “Atlanta BI PASS September 2010 — Post-Event Wrapup” »