
Data Science

(US Navy Blue Angels, San Francisco Fleet Week 2014 - Jeff M. Wang)


Data Science is Team Work!



Data Science


Data Science is about extracting knowledge from data. It is about methods to turn high-volume data and fragmented information into actionable knowledge. How can we design robust, principled models to combine complex data sets with other knowledge sources?  How can we design models that summarize and generate hypotheses from such data?  How can we characterize the uncertainty in large, heterogeneous data to provide better support for decisions?

Data science, also known as data-driven science, is an interdisciplinary field of scientific methods, processes, and systems for extracting knowledge or insights from data in various forms, structured or unstructured; in this respect it resembles data mining. It can be thought of as a basis for empirical research, in which information is inferred from observations. These observations are mainly data (or big data) related to a business or scientific case.

Insight, the data product of data science, is extracted from diverse data through a combination of exploratory data analysis and modeling. However, data science is not static, and it is not a one-time analysis. It is a process in which the models that lead to insights are constantly improved through further empirical evidence, or simply, more data. By analyzing past and current information, data science generates actions. This is not just an analysis of the past, but rather the generation of actionable information for the future (a prediction), like a weather forecast.
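The idea of a model that keeps improving as new evidence arrives can be illustrated with a very small sketch. The class name RunningMeanForecaster and the sample values are hypothetical, chosen only for illustration; a running mean stands in for whatever model a real project would use.

```python
class RunningMeanForecaster:
    """Hypothetical toy model: predicts the next value as the mean of all values seen so far."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        # Incremental mean update: each new observation refines the model
        # without reprocessing the earlier data.
        self.count += 1
        self.mean += (value - self.mean) / self.count

    def predict(self):
        # The current best guess for the next observation.
        return self.mean


# Illustrative (made-up) observations arriving one at a time.
forecaster = RunningMeanForecaster()
for observed in [10.0, 12.0, 14.0]:
    forecaster.update(observed)

prediction = forecaster.predict()
print(prediction)
```

Each call to update plays the role of "further empirical evidence": the prediction changes as data accumulates, rather than being fixed by a one-time analysis.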

Machine learning is the core step in data science: we apply machine learning and statistical methods to learn models from the data and extract knowledge. These models can be classification, clustering, regression, or density estimation models, among others.
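To make the model families named above more concrete, here is a minimal sketch of three of them (regression, classification, and density estimation) using only the Python standard library. The function names and the tiny data sets are assumptions made for illustration; in practice a team would more likely reach for an established library.

```python
import math
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Regression: least-squares slope and intercept for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def fit_centroids(points, labels):
    """Classification (nearest centroid): store the mean point of each labeled class."""
    grouped = {}
    for point, label in zip(points, labels):
        grouped.setdefault(label, []).append(point)
    return {
        label: tuple(mean(axis) for axis in zip(*members))
        for label, members in grouped.items()
    }


def classify(centroids, point):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], point))


def fit_gaussian(values):
    """Density estimation: fit a normal distribution and return its density function."""
    mu, sigma = mean(values), pstdev(values)

    def density(x):
        z = (x - mu) / sigma
        return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

    return density


# Illustrative (made-up) data for each model family.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
centroids = fit_centroids(
    [(0, 0), (1, 1), (9, 9), (10, 10)], ["low", "low", "high", "high"]
)
label = classify(centroids, (2, 1))
density = fit_gaussian([2.0, 4.0, 4.0, 6.0])
print(slope, intercept, label, density(4.0))
```

All three follow the same pattern: a fit step learns parameters from data, and the resulting model is then applied to new inputs. Clustering follows the same pattern but works without labels.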


Building a Big Data Team and Strategy


In reality, a data scientist is often a team of people who act as one. A data science team comes together to analyze business or scientific cases that no individual member could solve alone. A solution has many moving parts, but in the end they must all come together to provide actionable insight based on big data. Being able to use evidence-based insight in business decisions is more important now than ever. Making this happen takes a combination of technical, business, and soft skills.

When building a big data strategy, it is important to integrate big data analytics with business objectives. Communicate goals and secure organizational buy-in for analytics projects. Build teams with diverse talents, and establish a teamwork mindset. Remove barriers to data access and integration. Finally, iterate on these activities to respond to new business goals and technological advances.


[More to come ...]


