Descriptive statistics are numerical measures that summarize and describe the main features of a dataset. They provide a way to understand and interpret the data by presenting it in a meaningful and concise manner.

Population and Sample

The first step of every statistical analysis you perform is to determine if the data you are dealing with is a population or a sample.
Population – the collection of all items of interest to our study. Its size is denoted by N. The numbers we calculate from a population are called parameters.
Sample – a subset of the population. Its size is denoted by n. The numbers we calculate from a sample are called statistics.
Samples are less time-consuming and less costly to analyze than populations.
A sample must be both random and representative for the insights drawn from it to be reliable.
- Randomness – a random sample is collected when each member of the sample is chosen from the population strictly by chance.
- Representativeness – a representative sample is a subset of the population that accurately reflects the members of the entire population.
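
As a minimal sketch of the distinction between a parameter and a statistic, the snippet below draws a simple random sample from a made-up population using Python's standard `random` module. The population values, the seed, and the sample size of 50 are assumptions chosen only for illustration.

```python
import random

# Hypothetical population: ages of all 1,000 members of some group (made-up data).
rng = random.Random(42)  # fixed seed so the sketch is reproducible
population = [rng.randint(18, 80) for _ in range(1000)]

N = len(population)                    # population size
population_mean = sum(population) / N  # a parameter

# Simple random sample: each member is chosen strictly by chance,
# without replacement.
n = 50
sample = rng.sample(population, n)
sample_mean = sum(sample) / n          # a statistic

print(f"N = {N}, population mean = {population_mean:.2f}")
print(f"n = {n}, sample mean     = {sample_mean:.2f}")
```

Random selection alone does not guarantee representativeness for a small sample, which is why the two properties are listed separately above.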

Types of Data and Measurement Levels

We can classify data in two main ways:

Types of Data
- Categorical – describes categories or groups. Example: car brands, answers to yes/no questions.
- Numerical – represents numbers.
  - Discrete – takes separate values that can be counted. Example: number of children you want to have, SAT scores, number of objects, banknotes, coins.
  - Continuous – can take any value within a range, so it is measured rather than counted. Example: body weight, height, area, distance, time.

Measurement Levels
- Qualitative –
  - Nominal – categories that cannot be ordered, such as car manufacturers, or seasons of the year.
  - Ordinal – consists of groups and categories which follow a strict order. Example: survey questions that have ranked answers like ‘poor’, ‘fair’, ‘good’, ‘great’, and ‘superior’.
- Quantitative –
  - Interval – values that do not have a true zero, so differences are meaningful but ratios are not. Less common. Example: temperatures in Celsius and Fahrenheit.
  - Ratio – values that have a true zero, so ratios are meaningful. Examples: number of objects, distance, time, temperature in Kelvin.
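
The practical difference between interval and ratio levels can be illustrated with temperatures. The short sketch below (the helper name `celsius_to_kelvin` and the two sample temperatures are illustrative assumptions) shows that a ratio computed on the Celsius scale is not preserved on the Kelvin scale, while a difference is.

```python
def celsius_to_kelvin(c):
    """Convert a Celsius temperature to Kelvin."""
    return c + 273.15

t1_c, t2_c = 10.0, 20.0  # made-up temperatures in degrees Celsius

# On the Celsius scale the ratio looks like 2, but 0 °C is an arbitrary zero,
# so "twice as hot" is not a valid conclusion.
ratio_celsius = t2_c / t1_c

# Kelvin has a true zero, so this ratio is meaningful.
ratio_kelvin = celsius_to_kelvin(t2_c) / celsius_to_kelvin(t1_c)

# Differences are meaningful on both scales.
diff_celsius = t2_c - t1_c
diff_kelvin = celsius_to_kelvin(t2_c) - celsius_to_kelvin(t1_c)

print(f"Ratio in Celsius: {ratio_celsius:.3f}")
print(f"Ratio in Kelvin:  {ratio_kelvin:.3f}")
print(f"Difference in Celsius: {diff_celsius:.1f}")
print(f"Difference in Kelvin:  {diff_kelvin:.1f}")
```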

Categorical Variables – Visualization Techniques

Frequency Distribution Tables
Pie Charts
Bar Charts
Pareto Diagrams – a special kind of bar chart where categories are shown in descending order of frequency, often with a line showing the cumulative frequency.
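
As a rough sketch of the numbers behind these charts, the snippet below builds a frequency distribution table for a categorical variable and orders it the way a Pareto diagram does (highest frequency first, with a running cumulative share). The survey responses are made-up data, and only the standard library's `collections.Counter` is used.

```python
from collections import Counter

# Made-up survey responses: the car brand each respondent drives.
responses = ["Audi", "BMW", "Audi", "Mercedes", "BMW", "Audi",
             "Audi", "Mercedes", "BMW", "Audi"]

counts = Counter(responses)
total = sum(counts.values())

# most_common() returns (category, frequency) pairs, highest frequency first,
# which is the ordering a Pareto diagram uses.
pareto_rows = []
cumulative = 0
for category, freq in counts.most_common():
    cumulative += freq
    pareto_rows.append((category, freq, freq / total, cumulative / total))

print(f"{'Category':<10}{'Freq':>6}{'Rel.':>8}{'Cum.':>8}")
for category, freq, rel, cum in pareto_rows:
    print(f"{category:<10}{freq:>6}{rel:>8.0%}{cum:>8.0%}")
```

The frequency column is what a bar chart or pie chart would display; the cumulative column corresponds to the line usually drawn on top of a Pareto diagram.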

Numerical Variables – Frequency Distribution Table

Frequency Distribution Tables can show either the actual number of observations falling in each range or the percentage of observations. In the latter instance, the distribution is called a relative frequency distribution.
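
A small sketch of both variants follows: it counts observations per interval and also converts each count into a relative frequency. The data values and the interval edges are assumptions made up for the example, and each interval is treated as including its lower edge but not its upper edge.

```python
# Made-up numerical data (e.g. ages of 12 survey respondents).
data = [23, 35, 41, 29, 52, 38, 47, 33, 26, 58, 44, 31]

# Hypothetical interval edges; each interval is [lower, upper).
edges = [20, 30, 40, 50, 60]

table = []
for lower, upper in zip(edges, edges[1:]):
    frequency = sum(1 for x in data if lower <= x < upper)
    relative = frequency / len(data)  # share of all observations
    table.append((lower, upper, frequency, relative))

print(f"{'Interval':<12}{'Frequency':>10}{'Relative':>10}")
for lower, upper, frequency, relative in table:
    label = f"[{lower}, {upper})"
    print(f"{label:<12}{frequency:>10}{relative:>10.2f}")
```

The relative frequencies always sum to 1, which makes distributions of differently sized datasets easier to compare.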

Histograms

A histogram is a graph that represents the frequency distribution of a single numerical variable. It groups the data into “bins” (also called “range groups”) and shows how many data points fall into each bin.
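
The binning step can be sketched in a few lines. The function below, `histogram_counts`, is a hypothetical helper written for this example (not a library function); it assumes equal-width bins spanning the minimum to the maximum value and that the values are not all identical. The measurements are made up.

```python
def histogram_counts(values, num_bins):
    """Count how many values fall into each of num_bins equal-width bins.

    Assumes the values are not all equal (otherwise the bin width is zero).
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # The maximum value would land one past the last bin, so clamp it.
        index = min(int((v - lo) / width), num_bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return edges, counts

# Made-up measurements of one numerical variable (heights in cm).
heights_cm = [150, 152, 158, 160, 161, 165, 167, 170, 171, 174, 180, 190]

edges, counts = histogram_counts(heights_cm, num_bins=4)

# Text rendering: one row per bin, one '#' per data point in that bin.
for i, count in enumerate(counts):
    print(f"{edges[i]:6.1f} - {edges[i + 1]:6.1f} | {'#' * count}")
```

The choice of the number of bins is a judgment call: too few bins hide the shape of the distribution, while too many leave most bins nearly empty.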