Showing posts with label data mining. Show all posts

Thursday, 16 April 2009

What is data mining?

After a couple of posts about coal and diamonds I thought it might be a good idea to post a straightforward answer to the question: What is data mining?

Data mining is the application of statistical techniques and artificial intelligence to find patterns in data that are not apparent using queries or other database techniques. Data patterns can provide insights into behaviours and trends that would otherwise remain hidden. Data mining is perhaps more descriptively known as knowledge discovery in data.

The statistical techniques are run as software programs whose parameters define how the algorithms are applied. The pattern-finding process can be run on different data sets and with different parameter settings. Models can be refined to improve the accuracy of the results.

Although data mining algorithms can be run on any data file, they are often applied to files where data has been brought together from a number of different sources. Different statistical techniques, or algorithms, are suited to different types of data, and different problems.

The basic premise of data mining is that predictions can be made about the future from a sample of past behaviour, i.e. the existing data files. For example, theatre bookings together with other information about those who made the bookings can be used to find patterns, and predict what types of production they might book in the future. Segments can be found and different marketing messages sent to them according to their profile.
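As a rough illustration of that premise, the sketch below uses past bookings to suggest what each segment might book next. The booking records, the age bands and the function names are all invented for the example, and a simple frequency count stands in for a real data mining algorithm.

```python
from collections import Counter, defaultdict

# Hypothetical past bookings: (customer age band, type of production booked).
PAST_BOOKINGS = [
    ("under 30", "comedy"),
    ("under 30", "musical"),
    ("under 30", "comedy"),
    ("30 to 60", "drama"),
    ("30 to 60", "musical"),
    ("30 to 60", "drama"),
    ("over 60", "opera"),
    ("over 60", "drama"),
    ("over 60", "opera"),
]


def find_patterns(bookings):
    """Count how often each segment booked each type of production."""
    counts = defaultdict(Counter)
    for segment, production in bookings:
        counts[segment][production] += 1
    return counts


def predict_next_booking(patterns, segment):
    """Return the production type the segment has booked most often, or None."""
    if segment not in patterns:
        return None
    return patterns[segment].most_common(1)[0][0]


patterns = find_patterns(PAST_BOOKINGS)
for segment in patterns:
    print(segment, "->", predict_next_booking(patterns, segment))
```

A real model would draw on many more attributes than an age band, but the shape is the same: learn from past behaviour, then apply what was learned to decide which message each segment receives.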

Data mining is the automatic or semi-automatic means of finding patterns and making predictions.

Data mining has now been built into Microsoft’s SQL Server database: starting with two algorithms in SQL Server 2000, extended to seven algorithms in SQL Server 2005, and with some further enhancements in SQL Server 2008.

Get in touch if you would like to find out whether your data files are suitable for data mining.

Tuesday, 7 April 2009

Diamonds are a girl's best friend

I wrote about data mining the other day and got some interesting comments. One was that my analogy of gold wasn’t quite accurate. Analogies are dangerous things.

The comment was quite correct, though, because although data mining can turn up information that is of significant value, it takes work to get there. A new analogy was offered: that of an uncut diamond. A diamond is as unappealing as coal in its raw state – but much sought-after in its cut and polished state.

So it is with data mining – gold coins do not drop into your lap as if you were playing a slot machine; you have to work with the new knowledge to figure out whether you have an uncut diamond or a piece of coal. It could be either – and of course an uncut diamond in the hands of someone who doesn’t know what to do with it might as well be coal.

A business intelligence specialist told me recently that he felt uncovering new knowledge from data was only part of the solution – the remainder being to display the data clearly and to communicate its meaning in an effective way.

Whilst the underlying data mining or statistical skills clearly have to be present, there is an element of polishing and crafting the newly found information to let its brilliance be seen. Perhaps what we need are some rather different skill sets in getting the full message across: communication and visionary skills.

This also goes some way towards explaining why business intelligence, data mining, performance management, and data visualisation fit together so well.

Thursday, 2 April 2009

Data mining - digging for gold

Just as coal is the work-horse of modern energy production, so the relational database is the work-horse of modern business. Alright, one is black and dusty and the other is, well, virtual and clean, but the end result is the same - reserves waiting to be mined, whether the reserves are coal or data.

Reserves which may contain gold for their owners.

Before you put on your hard hat with the lamp on the front to break open the server, I’m talking metaphorical gold - metaphorical gold which could be worth a great deal more to your business than the real thing.

First – let’s consider the reserves, which unless you are actually a mining company will be the data stored in databases within your company. Then, let’s look at what the gold might be that’s hidden in the data.

There are few businesses which have not installed a database, whether for managing customers, accounts or stock. As more enterprise-wide systems are installed the amount of data being generated is phenomenal. Some of that data will be immediately accessible through reporting tools. But what could happen if those databases were joined together? What if you could see the sales information together with the customer management information? Or the training data together with sales data? At the risk of mixing enough metaphors to make soup, that would really be cooking with gas ….
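To make the idea of joining databases a little more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The two tables stand in for a customer management system and a sales system; the table names, columns and figures are all made up for illustration, and in practice the data would first have to be brought together from the separate systems.

```python
import sqlite3

# In-memory database standing in for two separate business systems.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT);
    CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);

    INSERT INTO customers VALUES (1, 'Avery', 'North'), (2, 'Blake', 'South'), (3, 'Casey', 'North');
    INSERT INTO sales VALUES (1, 1, 120.0), (2, 2, 80.0), (3, 1, 40.0), (4, 3, 60.0);
    """
)

# Join the two sources on the shared customer_id and total the sales per region.
rows = conn.execute(
    """
    SELECT c.region, SUM(s.amount) AS total_sales
    FROM sales AS s
    JOIN customers AS c ON c.customer_id = s.customer_id
    GROUP BY c.region
    ORDER BY c.region
    """
).fetchall()

for region, total_sales in rows:
    print(region, total_sales)

conn.close()
```

Neither table on its own can say how much each region spends; the answer only appears once the sales records are seen alongside the customer records.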

But whether it’s one database, or a number joined together, how do you go about looking for gold? Indeed what does gold look like in data terms?

How you find gold is by using a technique called data mining, and what it looks like all depends on your business. It may be customers who are more likely to book a particular type of show in your theatre, or finding which products to bundle together to maximise sales and profit. Or it could be something completely different – depending on what business you are in. The applications for data mining are many and varied and are limited only by business owners' imagination and ambitions.
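For the product bundling case, a very simple starting point is to count which products turn up together in the same sale. The sketch below does just that; the baskets, the product names and the `min_support` parameter are hypothetical, and a plain pair count stands in for the association algorithms found in data mining products.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets: each set is the products in one sale.
BASKETS = [
    {"ticket", "programme", "interval drinks"},
    {"ticket", "programme"},
    {"ticket", "interval drinks"},
    {"ticket", "programme", "ice cream"},
    {"ticket"},
]


def frequent_pairs(baskets, min_support):
    """Return product pairs that appear together in at least min_support baskets."""
    pair_counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


# The same data gives different answers under different parameter settings.
print(frequent_pairs(BASKETS, min_support=2))
print(frequent_pairs(BASKETS, min_support=3))
```

Raising `min_support` leaves only the strongest pairings, which is one small example of how a parameter setting changes what the mining turns up.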

Data mining is now more accessible and affordable than ever. Products such as Microsoft SQL Server 2008 and 2005 put data mining within the reach of most companies – large or small.

Get in touch if you want to dig for gold in your data. Hard hats with lamps supplied.