Data visualization is the practice of representing data in graphical form so that it is easier to interpret. Its goal is to communicate the information in the data more effectively through visual cues. Visualizing data can also help us spot missing values, duplicate values, outliers, and incorrect values. It is used in data cleaning, dimensionality reduction, and checking data distributions.
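As a rough illustration of spotting outliers and missing values, the sketch below counts missing entries and then draws a box plot with Matplotlib. The measurements are made up for this example, and the outlier rule is Matplotlib's default (points beyond 1.5 times the interquartile range from the box), which is only one of several possible conventions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up measurements: one extreme value and one missing value (NaN).
data = np.array([12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 30.5, np.nan, 12.2])

# Count missing values, then drop them before plotting.
n_missing = int(np.isnan(data).sum())
clean = data[~np.isnan(data)]

fig, ax = plt.subplots()
result = ax.boxplot(clean)
ax.set_ylabel("Measurement")
ax.set_title("Box plot with one outlier")

# Points drawn individually beyond the whiskers ("fliers") are the candidate outliers.
outliers = result["fliers"][0].get_ydata()
print("missing:", n_missing, "outliers:", list(outliers))

fig.savefig("boxplot_outliers.png")
```

A flagged point is only a candidate: whether it is an error or a genuine extreme value still has to be decided from knowledge of the data.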
Exploratory Visualization
Designed to help the analyst understand what is in a dataset. An array of pairwise scatter plots is well suited to exploration, because it shows the relationship between every pair of variables at once.
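An array of pairwise scatter plots could be produced as in the sketch below, which uses `scatter_matrix` from pandas. The column names and the synthetic data are assumptions made up for illustration; any numeric DataFrame would work the same way.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Synthetic data, made up purely for illustration.
rng = np.random.default_rng(0)
height = rng.normal(170, 10, 200)
df = pd.DataFrame({
    "height_cm": height,
    "weight_kg": 0.9 * height - 85 + rng.normal(0, 8, 200),
    "age_years": rng.integers(18, 70, 200),
})

# One scatter plot per pair of columns, with a histogram of each column on the diagonal.
axes = scatter_matrix(df, figsize=(7, 7), diagonal="hist")
plt.savefig("pairwise_scatter.png")
```

With three columns this gives a 3 x 3 grid of panels. Seaborn offers a similar function, `pairplot`, if that library is preferred.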
Explanatory Visualization
Designed to convey an analyst's findings or conclusions to others.
Basic Charts
Bar charts, line charts, pie charts, scatter plots, histograms, box plots, and heat maps are some of the basic charts most often used to visualize data.
Pie Chart
A pie chart is best for showing the percentage breakdown of a variable across categories, such as gender, when we want to emphasize each category's share of the whole. Numerical variables such as age or income can also be shown in a pie chart after binning them into classes.
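Binning a numerical variable before drawing a pie chart might look like the following sketch. The ages, the bin edges, and the class labels are all made up for illustration; `pd.cut` does the binning and Matplotlib draws the chart.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical ages, made up for illustration.
ages = pd.Series([23, 35, 41, 19, 52, 67, 30, 45, 28, 71, 38, 60])

# Bin the numerical variable into labelled classes (the edges are an arbitrary choice).
# right=False makes each class include its lower edge and exclude its upper edge.
bins = [0, 30, 50, 120]
labels = ["under 30", "30 to 49", "50 and over"]
age_group = pd.cut(ages, bins=bins, labels=labels, right=False)

# Count the members of each class, keeping the class order instead of sorting by size.
counts = age_group.value_counts(sort=False)

fig, ax = plt.subplots()
ax.pie(counts.to_numpy(), labels=list(counts.index), autopct="%1.0f%%")
ax.set_title("Share of people by age group")
fig.savefig("age_pie.png")
```

Different bin edges would give different slices, so the classes should be chosen to match the question being asked.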
Bar Chart
A bar chart is another way of representing nominal and ordinal data, with one bar per category and the bar length showing the value. Examples include the monthly sales of a company or the daily page visits to a blog.
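A minimal bar chart of monthly sales could be sketched as follows with Matplotlib. The sales figures are invented for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

# Made-up monthly sales figures, for illustration only.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
sales = [120, 135, 128, 150, 162, 158]

fig, ax = plt.subplots()
bars = ax.bar(months, sales)  # one bar per category, height equal to the value
ax.set_xlabel("Month")
ax.set_ylabel("Sales (units)")
ax.set_title("Monthly sales")
fig.savefig("monthly_sales.png")
```

Because months are ordinal, the bars are kept in calendar order rather than sorted by height.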
Histogram
A histogram is used to visualize interval and ratio data. It looks like a vertical bar chart, except that there is no space between the bars. The adjacent bars indicate that a continuous numerical range is being summarized: each bar shows the frequency of values falling in one class (bin), and the class boundaries are chosen by the analyst.
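The sketch below draws a histogram with Matplotlib and also returns the frequencies and class boundaries that the bars represent. The data are synthetic, and the choice of 10 bins is arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic ratio-scale data, made up for illustration.
rng = np.random.default_rng(42)
values = rng.normal(loc=50, scale=10, size=500)

fig, ax = plt.subplots()
# hist returns the frequency in each class and the edges of the classes.
counts, edges, _ = ax.hist(values, bins=10, edgecolor="black")
ax.set_xlabel("Value")
ax.set_ylabel("Frequency")
ax.set_title("Histogram with 10 equal-width bins")
fig.savefig("histogram.png")
```

Ten bins need eleven edges, and the frequencies add up to the number of observations. Changing `bins` can change the apparent shape of the distribution, so it is worth trying a few values.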
The chart to use depends on what you want to show: a comparison between variables, a relationship between two variables, the distribution of a variable, or the composition of a variable.
Top Data Visualization Packages in Python
- Matplotlib
- Seaborn
- Plotly
- Bokeh
- ggplot
- Altair
Top Data Visualization Packages in R
- ggplot2
- Lattice