What is Data Science?
Data Science is the field of study that combines the technical capacities of computer science and mathematics with domain-specific knowledge to extract useful insights from data. The term 'data science' can be used as a catch-all phrase for the broad field that includes data engineering, data analytics, machine learning, artificial intelligence, and business intelligence. Data science can be described as a superset of statistics. As a discipline, it draws heavily from both statistics and linear algebra.
[Figure: Venn diagram showing Data Science at the intersection of Computer Science, Mathematics (Statistics/Linear Algebra), and Domain Knowledge]
Data science delivers value to organizations by enhancing the intelligence picture to support better decision making. Professional data scientists discover and clearly communicate information that organizations use to improve operational performance through more precisely targeted resource use, identification of previously unknown markets, and improved alignment of product/service with customers.
This effort results in reduced expenses, increased revenue flow, higher profit margins, stronger teams, and more effective marketing campaigns. A significant goal of data science is to discover information that a company can use to achieve a stronger competitive position in the marketplace.
A Data Science Project Workflow
Identify
Identify Specific Problem to Solve
Involve Stakeholders
Business Understanding
Select Analytic Approach
Descriptive ✱ What happened?
Diagnostic ✱ Why did it happen?
Predictive ✱ What will happen?
Prescriptive ✱ What action to take?
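The four analytic approaches above can be told apart by the question each one answers. The following is a minimal sketch, assuming a tiny made-up dataset of monthly ad spend and units sold; the numbers and the names `history`, `predict_sales`, and `best_budget` are hypothetical and only serve the illustration.

```python
from statistics import mean

# Hypothetical monthly figures: (ad_spend, units_sold). Made up for illustration.
history = [(10, 120), (12, 135), (15, 160), (18, 178), (20, 200)]
spend = [s for s, _ in history]
sales = [u for _, u in history]

# Descriptive: what happened? Summarize the past.
avg_sales = mean(sales)

# Diagnostic: why did it happen? Fit a least-squares line of sales on ad spend.
# A positive slope suggests an association, not proof of cause.
mean_x, mean_y = mean(spend), mean(sales)
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in history)
    / sum((x - mean_x) ** 2 for x in spend)
)
intercept = mean_y - slope * mean_x


def predict_sales(ad_spend):
    """Predictive: what will happen? Extrapolate from the fitted line."""
    return intercept + slope * ad_spend


def best_budget(candidates, unit_margin):
    """Prescriptive: what action to take? Pick the candidate ad budget
    with the highest predicted profit (margin on sales minus ad spend)."""
    return max(candidates, key=lambda b: predict_sales(b) * unit_margin - b)
```

Each later step builds on the earlier ones: the prescriptive choice is only as good as the prediction, which is only as good as the diagnosis behind it.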
Collect
Collect Data
Software Engineering
Data Requirements, Collection, Mining, Exploration, Understanding, Cleaning, Preparation
Instrumentation, Logging, Sensors, External Data, User Generated Content
Process
Process Data
Data Engineering
Reliable Data Flow, Infrastructure, Pipelines, ETL (Extract, Transform, Load), Structured and Unstructured Data Storage
Cleaning, Wrangling, Anomaly Detection, Preparation
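The Process stage above can be sketched as a small extract, transform, load pipeline. This is a minimal illustration using only the Python standard library; the sensor readings in `RAW_CSV`, the column names, and the `readings` table are assumptions invented for the example, not part of any real project.

```python
import csv
import io
import sqlite3

# Hypothetical raw export with stray whitespace, a missing value, and mixed units.
RAW_CSV = """sensor_id,reading,unit
a1, 20.5 ,C
a2,,C
a3,68.9,F
a1,21.0,C
"""


def extract(text):
    """Extract: parse the raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with missing readings and convert everything to Celsius."""
    clean = []
    for row in rows:
        value = row["reading"].strip()
        if not value:
            continue  # cleaning step: skip rows with no reading
        reading = float(value)
        if row["unit"].strip() == "F":
            reading = (reading - 32) * 5 / 9  # Fahrenheit to Celsius
        clean.append((row["sensor_id"].strip(), round(reading, 2)))
    return clean


def load(rows, conn):
    """Load: write the cleaned rows into a SQLite table."""
    conn.execute("CREATE TABLE IF NOT EXISTS readings (sensor_id TEXT, celsius REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, discarded on exit
load(transform(extract(RAW_CSV)), conn)
```

Keeping the three steps as separate functions makes each one easy to test and replace, for example swapping the in-memory database for a file-backed one.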
Label
Aggregate/Label Data
Data Science Analytics
Analytics, Metrics, Segments, Aggregates, Features, Training Data Preparation
A/B Testing, Experimentation
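One common way to evaluate an A/B test is a two-proportion z-test on conversion rates. The sketch below is one possible approach, not the only one; the function name `two_proportion_z_test` and its parameters are hypothetical, and it assumes samples large enough for the normal approximation to hold.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variants A and B.

    conv_* is the number of conversions and n_* the number of visitors.
    Returns (z, p_value) for a two-sided test. Assumes the pooled rate is
    strictly between 0 and 1, otherwise the standard error is zero.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

For example, `two_proportion_z_test(100, 1000, 150, 1000)` compares a 10% conversion rate against a 15% rate with 1,000 visitors in each group; a small p-value suggests the difference is unlikely to be due to chance alone.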
Model
Build Data Model
Machine Learning
Feature Engineering, Model Training, Evaluation, Deployment, Monitoring, Assessment, Optimization
AI, Deep Learning, Research Science
Report
Report to Stakeholders
Data Visualization, Executive Summary, Detailed Analysis/Conclusions, Storytelling with Data
Choose Format Option: Formal report, Live/interactive dashboard, Quick one-page summary (minimum viable product)
Data Science Certificates
IBM Data Science Professional Certificate
What is Data Science?
Data Science Tools
Methodology
Python
Python Project
Databases and SQL
Data Analysis
Data Visualization
Machine Learning
Capstone
Data Science Capstone Project
Winning the Space Race with Data Science
Data was collected and analyzed to understand the nature of the first stage landing success rate for the SpaceX Falcon 9 rocket.
This data was used to build data visualizations (static plots, interactive maps, and an interactive dashboard).
Machine learning models (Logistic Regression, Support Vector Machine, Decision Tree, and k-Nearest Neighbors) were trained on this data to predict the success of future SpaceX Falcon 9 rocket first stage landings.
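To give a feel for one of the model types named above, here is a minimal k-Nearest Neighbors classifier in pure Python. It is a sketch only and not the capstone's actual code: the features (payload mass and flight number), the rows in `TRAIN`, and the helper names are hypothetical placeholders, not real launch data.

```python
import math
from collections import Counter

# Hypothetical training rows: (payload_mass_tonnes, flight_number) -> 1 if the
# first stage landed, 0 if not. Values are invented for illustration.
TRAIN = [
    ((2.0, 5), 0), ((3.5, 8), 0), ((5.0, 12), 0),
    ((4.0, 40), 1), ((5.5, 55), 1), ((3.0, 60), 1),
]


def knn_predict(train, point, k=3):
    """Predict a label by majority vote among the k closest training points.

    Uses plain Euclidean distance on unscaled features; a real workflow
    would usually standardize the features first.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train, features, k) == label for features, label in test)
    return hits / len(test)
```

In practice the same train-then-evaluate pattern applies to the other model types as well, with accuracy on held-out data used to compare them.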
Clark Data Science
Data Analytics, Data Infrastructure, System Organization
I offer targeted, high-quality data analytics, data infrastructure, and system organization services to scientific labs, engineering firms, industrial companies, small businesses, startups, and individuals.
Reach out today!