Be a citizen data scientist: Tools democratizing data science

Analytics vendors and non-technical employees are democratizing data science. Organizations are starting to treat every employee as a potential data scientist so that each can apply their own expertise to business problems.

Most industry analysts are highlighting the increased role of citizen data scientists in organizations:

  • IDC big data analytics and AI research director Chwee Kan Chua mentions in an interview: “Lowering the barriers to allow even non-technical business users to be ‘data scientists’ is a great approach.”
  • According to Gartner, citizen data scientists create models that use advanced diagnostic, predictive, or prescriptive analytics, but their primary job function is outside the field of statistics and analytics.

Why are there more citizen data scientists now?

These trends support democratization of analytics:

What are the tools used by citizen data scientists?

Citizen data scientists first need to access business data from various systems. For example, is a self-service data reporting tool which allows employees to pull data from various databases for easy analysis and automated reporting. We have listed and other solutions for reporting.


Machine Learning Accuracy: Learn the Metric to Assess ML Models [’20]

Demonstrates the 4 possible categories of results of a model: True positive, false positive, true negative, false negative

There are various theoretical approaches to measuring the accuracy* of competing machine learning models. However, in most commercial applications, you simply need to assign a business value to the 4 types of results: true positives, true negatives, false positives and false negatives. Multiply the number of results in each bucket by the associated business value and sum the four products to get each model's total value. Choosing the model with the highest total ensures that you use the best model available.
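The calculation described above can be sketched in a few lines of Python. This is only an illustration: the per-result business values, the result counts for the two models, and the function name `total_business_value` are all hypothetical and not taken from the article.

```python
# Hypothetical business value (e.g. in dollars) of a single result of each type.
# These numbers are assumptions for illustration only.
VALUE_PER_RESULT = {"tp": 100.0, "tn": 0.0, "fp": -20.0, "fn": -150.0}


def total_business_value(counts, value_per_result=VALUE_PER_RESULT):
    """Multiply the count in each bucket by its business value and sum."""
    return sum(counts[kind] * value_per_result[kind] for kind in value_per_result)


def plain_accuracy(counts):
    """Share of correct results, ignoring business value."""
    return (counts["tp"] + counts["tn"]) / sum(counts.values())


# Made-up results of two competing models on the same 1,000 cases.
model_a = {"tp": 80, "tn": 890, "fp": 10, "fn": 20}
model_b = {"tp": 95, "tn": 850, "fp": 50, "fn": 5}

value_a = total_business_value(model_a)
value_b = total_business_value(model_b)
best = "model_a" if value_a >= value_b else "model_b"

print(f"model_a: accuracy={plain_accuracy(model_a):.3f}, value={value_a}")
print(f"model_b: accuracy={plain_accuracy(model_b):.3f}, value={value_b}")
print("best by business value:", best)
```

With these assumed numbers, the model with the lower plain accuracy comes out ahead on business value, because it avoids more of the costly false negatives.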

Further complicating this situation are the confidence values provided by the model. Almost all machine learning models can be built to provide a level of confidence for their answer. A high-level approach to using this value in accuracy* measurement is to multiply it with the results, essentially rewarding the model for providing high confidence values for its correct assessments.
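One possible reading of this confidence-weighted approach is sketched below. It assumes that each prediction comes with a confidence between 0 and 1 and that every result, correct or not, is scaled by that confidence, so confident mistakes are also penalized more. The business values, the sample predictions and the function names are hypothetical.

```python
# Hypothetical business value of a single result of each type (assumption).
VALUE_PER_RESULT = {"tp": 100.0, "tn": 0.0, "fp": -20.0, "fn": -150.0}


def result_type(actual, predicted):
    """Classify one prediction as tp, fp, tn or fn."""
    if predicted:
        return "tp" if actual else "fp"
    return "fn" if actual else "tn"


def confidence_weighted_value(predictions, value_per_result=VALUE_PER_RESULT):
    """Sum of confidence * business value over all predictions.

    `predictions` is an iterable of (actual, predicted, confidence) tuples,
    with confidence expected to lie between 0 and 1.
    """
    return sum(
        confidence * value_per_result[result_type(actual, predicted)]
        for actual, predicted, confidence in predictions
    )


# Made-up predictions: (actual label, predicted label, model confidence).
sample = [
    (True, True, 0.9),    # confident true positive
    (False, True, 0.6),   # false positive
    (True, False, 0.5),   # false negative
    (False, False, 0.8),  # true negative
]

score = confidence_weighted_value(sample)
print(f"confidence-weighted value: {score:.2f}")
```

A model that is confident when it is right and hesitant when it is wrong scores higher under this scheme than one with the same results but uniform confidence.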


60 Top Data Science Tools: In-depth Guide [2020 update]

Data science tools are evolving. There are 2 classes of tools emerging:

  1. Self-service tools for those with technical expertise (programming skills and understanding of statistics and computer science)
  2. Tools for business users that automate commonly used analysis

Learn the most popular data science tools for techies

Becoming a data scientist is hard. In any hard task, focus is critical. As a data scientist, Python should probably be the first tool you master.

Kaggle, the community for data science competitions, publishes surveys of data scientists, such as its 2017 “The State of Data Science” report. Below, you can find the most popular tools from that survey:

Python and R are the top performers. Other sources such as KDNuggets’ poll results also support this.


Data Science Consulting & Consultants in 2020: In-depth Guide

Graph of Google search trends on data science

Interest in data science grew more than 5x over the last 5 years, as you can see above.

However, it is still not clear to many how data science consulting differs from regular consulting. After all, consulting is supposed to be about making data-driven decisions. A critical difference is that data science consultants leave their clients with reusable operational models, whereas most regular consulting projects answer important but one-off questions and do not leave clients with operational decision-making models.